Re: PSA: Make sure your Airflow instance isn't public and isn't Google indexed


I suggest reading the section on password complexity here
https://pages.nist.gov/800-63-3/sp800-63b.html which recommends just a
minimum length and a check against a list of the most common passwords.
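
As a rough illustration of that recommendation, here is a minimal sketch of a check that enforces only a minimum length and a blocklist lookup. The function name, the tiny sample blocklist, and the case-insensitive comparison are my own assumptions, not anything that exists in Airflow; the 8-character floor is the SP 800-63B minimum for user-chosen secrets.

```python
# Hypothetical sample only; a real deployment would load a much larger
# list of commonly used or breached passwords.
COMMON_PASSWORDS = {"password", "12345678", "qwertyuiop", "letmein123"}

# SP 800-63B asks for at least 8 characters for user-chosen secrets.
MIN_LENGTH = 8


def check_password(candidate, blocklist=COMMON_PASSWORDS, min_length=MIN_LENGTH):
    """Return the reasons a password is rejected (empty list if accepted)."""
    problems = []
    if len(candidate) < min_length:
        problems.append(f"must be at least {min_length} characters")
    # Case-insensitive lookup is an assumption of this sketch.
    if candidate.lower() in blocklist:
        problems.append("appears in the list of common passwords")
    return problems
```

Note that there are deliberately no composition rules (digits, symbols, mixed case) in the sketch; length and the blocklist are the only checks.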

On Tue, Jun 5, 2018 at 3:14 PM, Maxime Beauchemin <
maximebeauchemin@xxxxxxxxx> wrote:

> Agreed, secure by default is ideal. Though I wouldn't want people to get
> an unreasonable sense of safety and open their instance to the web.
>
> I like the idea of generating a temporary key/token and exposing it in the
> console where the process was started. Another option is to use the
> database/password mechanism by default and add an
> `airflow create-user --admin` CLI command to generate a user. With the
> level of cluelessness we're observing we should probably force a certain
> password complexity level.
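
The temporary token idea could look roughly like the sketch below, which uses only the Python standard library. The function names and the startup message are hypothetical; nothing here is an existing Airflow API, and where the token would be stored and checked is left open.

```python
import secrets


def generate_startup_token(nbytes=32):
    """Create a random URL-safe token for a single server session."""
    return secrets.token_urlsafe(nbytes)


def token_matches(expected, supplied):
    """Compare tokens in constant time to avoid leaking timing information."""
    return secrets.compare_digest(expected.encode(), supplied.encode())


if __name__ == "__main__":
    token = generate_startup_token()
    # Shown once in the console where the process was started.
    print(f"Login token for this session: {token}")
```

Because the token is regenerated on every start, it never needs to be written to a config file.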
>
> We should also state clearly in the docs that Airflow is not regularly
> pen-tested and should not be exposed to the Internet.
>
> For the record we had Airflow pen-tested at Airbnb by a third party in 2016
> (or was it 2017?) and found and resolved half a dozen or so vulnerabilities.
> Note that there's no recurring process in place, or any mechanisms to
> prevent regressions beyond code review. Also note that the new [beta in
> 1.10] UI has not been pen-tested (to my knowledge).
>
> Max
>
> On Tue, Jun 5, 2018 at 2:48 PM Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>
> > To be honest, I would like to move to a setup that is secure by default.
> > Airflow is used more and more, which also increases the attack surface.
> > If you run “initdb” or “resetdb” it is easy to provide a generated
> > password.
> >
> > I don’t see a reason anymore for having an unsecured version.
> >
> > B.
> >
> > Sent from my iPad
> >
> > > On 5 Jun 2018, at 23:11, Christopher Bockman <chris@xxxxxxxxxxxxxxx>
> > > wrote:
> > >
> > > +1 to being able to disable--we have authentication in place, but use a
> > > separate solution that (probably?) Airflow won't realize is enabled, so
> > > having a continuous giant warning banner would be rather unfortunate.
> > >
> > >> On Tue, Jun 5, 2018 at 2:05 PM, Alek Storm <alek.storm@xxxxxxxxx>
> > wrote:
> > >>
> > >> This is a great idea, but we'd appreciate a setting that disables the
> > >> banner even if those conditions aren't met - our instance is deployed
> > >> without authentication, but is only accessible via our intranet.
> > >>
> > >> Alek
> > >>
> > >>
> > >> On Tue, Jun 5, 2018, 3:35 PM James Meickle <jmeickle@xxxxxxxxxxxxxx>
> > >> wrote:
> > >>
> > >>> I think that a banner notification would be a fair penalty if you access
> > >>> Airflow without authentication, or have API authentication turned off, or
> > >>> are accessing via http:// with a non-localhost `Host:`. (Are there any
> > >>> other circumstances to think of?)
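
The three conditions listed above can be sketched as a single pure function. This is only an illustration: the function name, its parameters, and the set of hosts treated as local are assumptions, and how the webserver would learn whether auth is enabled is not addressed.

```python
# Hosts treated as "local" in this sketch (an assumption).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def _hostname(host_header):
    """Strip an optional port from a Host header, including [IPv6]:port."""
    host = host_header.strip().lower()
    if host.startswith("["):
        return host[1:host.find("]")]
    return host.split(":", 1)[0]


def banner_reasons(auth_enabled, api_auth_enabled, scheme, host_header):
    """Return the reasons a warning banner should be shown (empty if none)."""
    reasons = []
    if not auth_enabled:
        reasons.append("web authentication is off")
    if not api_auth_enabled:
        reasons.append("API authentication is off")
    if scheme == "http" and _hostname(host_header) not in LOCAL_HOSTS:
        reasons.append("served over plain http to a non-local host")
    return reasons
```

Returning the reasons rather than a bare boolean would let the banner say what is wrong, and makes it easy to add further conditions later.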
> > >>>
> > >>> I would also suggest serving a default robots.txt to mitigate accidental
> > >>> indexing of public instances (as most public instances will be
> > >>> accidentally public, statistically speaking). If you truly want your
> > >>> Airflow instance public and indexed, you should have to go out of your
> > >>> way to permit that.
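
One way a default robots.txt could be served is a small WSGI wrapper around the web application, sketched below. The class name and the `allow_indexing` opt-out flag are hypothetical, and this is not how Airflow is actually wired up; it only shows the "deny by default, opt in to indexing" behavior described above.

```python
# Tells all well-behaved crawlers not to index anything.
ROBOTS_TXT = b"User-agent: *\nDisallow: /\n"


class RobotsTxtMiddleware:
    """Wrap a WSGI app and answer /robots.txt with a deny-all policy."""

    def __init__(self, app, allow_indexing=False):
        self.app = app
        # Operators who really want to be indexed must opt in explicitly.
        self.allow_indexing = allow_indexing

    def __call__(self, environ, start_response):
        if not self.allow_indexing and environ.get("PATH_INFO") == "/robots.txt":
            headers = [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(ROBOTS_TXT))),
            ]
            start_response("200 OK", headers)
            return [ROBOTS_TXT]
        # Everything else is passed through to the wrapped application.
        return self.app(environ, start_response)
```

A robots.txt only discourages indexing by cooperative crawlers; it does nothing to restrict access to the instance itself.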
> > >>>
> > >>> On Tue, Jun 5, 2018 at 1:51 PM, Maxime Beauchemin <
> > >>> maximebeauchemin@xxxxxxxxx> wrote:
> > >>>
> > >>>> What about a clear alert on the UI showing when auth is off? Perhaps a
> > >>>> large red triangle-exclamation icon on the navbar with a tooltip
> > >>>> "Authentication is off, this Airflow instance is not secure." and
> > >>>> clicking it takes you to the docs' security page.
> > >>>>
> > >>>> Well and then of course people should make sure their infra isn't open to
> > >>>> the Internet. We really shouldn't have to tell people to keep their
> > >>>> infrastructure behind a firewall. In most environments you have to do
> > >>>> quite a bit of work to open any resource up to the Internet (SSL certs,
> > >>>> special security groups for load balancers/proxies, ...). Now I'm curious
> > >>>> to understand how UMG managed to do this by mistake...
> > >>>>
> > >>>> Also a quick reminder to use the Connection abstraction to store secrets,
> > >>>> ideally using the environment variable feature.
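
For anyone unfamiliar with the environment variable feature: Airflow looks up connections from variables named `AIRFLOW_CONN_<CONN_ID>` whose value is a connection URI. The sketch below only builds such a name and URI with the standard library; the helper names, the credentials, and the host are hypothetical, and in practice the variable would be set by the deployment environment rather than in Python code.

```python
from urllib.parse import quote


def connection_env_var(conn_id):
    """Name of the environment variable for a given connection id."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


def connection_uri(scheme, user, password, host, port, schema):
    """Build a connection URI, percent-encoding the credentials."""
    # Encoding matters when the password contains characters like '@' or '/'.
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{schema}"
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    name = connection_env_var("my_db")
    uri = connection_uri("postgres", "etl", "p@ss/word", "db.internal", 5432, "warehouse")
    print(f"export {name}='{uri}'")
```

Keeping the secret in the environment means it stays out of DAG files and out of anything the web UI displays.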
> > >>>>
> > >>>> Max
> > >>>>
> > >>>> On Tue, Jun 5, 2018 at 10:02 AM Taylor Edmiston <
> tedmiston@xxxxxxxxx>
> > >>>> wrote:
> > >>>>
> > >>>>> One of our engineers wrote a blog post about the UMG mistakes as well.
> > >>>>>
> > >>>>> https://www.astronomer.io/blog/universal-music-group-airflow-leak/
> > >>>>>
> > >>>>> I know that best practices are well known here, but I second James'
> > >>>>> suggestion that we add some docs, code, or config so that the framework
> > >>>>> optimizes for being (nearly) production-ready by default and not just
> > >>>>> easy to start with for local dev. Admittedly this takes some work to not
> > >>>>> add friction to the local onboarding experience.
> > >>>>>
> > >>>>> Do most people keep separate airflow.cfg files per environment like
> > >>>>> what's considered the best practice in the Django world? e.g.
> > >>>>> https://stackoverflow.com/q/10664244/149428
> > >>>>>
> > >>>>> Taylor
> > >>>>>
> > >>>>> *Taylor Edmiston*
> > >>>>> Blog <https://blog.tedmiston.com/> | CV
> > >>>>> <https://stackoverflow.com/cv/taylor> | LinkedIn
> > >>>>> <https://www.linkedin.com/in/tedmiston/> | AngelList
> > >>>>> <https://angel.co/taylor> | Stack Overflow
> > >>>>> <https://stackoverflow.com/users/149428/taylor-edmiston>
> > >>>>>
> > >>>>>
> > >>>>> On Tue, Jun 5, 2018 at 9:57 AM, James Meickle <
> > >> jmeickle@xxxxxxxxxxxxxx
> > >>>>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Bumping this one because now Airflow is in the news over it...
> > >>>>>>
> > >>>>>> https://www.bleepingcomputer.com/news/security/contractor-exposes-credentials-for-universal-music-groups-it-infrastructure/?utm_campaign=Security%2BNewsletter&utm_medium=email&utm_source=Security_Newsletter_co_79
> > >>>>>>
> > >>>>>> On Fri, Mar 23, 2018 at 9:33 AM, James Meickle <
> > >>>> jmeickle@xxxxxxxxxxxxxx>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> While Googling something Airflow-related a few weeks ago, I noticed that
> > >>>>>>> someone's Airflow dashboard had been indexed by Google and was accessible
> > >>>>>>> to the outside world without authentication. A little more Googling
> > >>>>>>> revealed a handful of other indexed instances in various states of
> > >>>>>>> security. I did my best to contact the operators, and waited for
> > >>>>>>> responses before posting this.
> > >>>>>>>
> > >>>>>>> Airflow is not a secure project by default
> > >>>>>>> (https://issues.apache.org/jira/browse/AIRFLOW-2047), and you can do all
> > >>>>>>> sorts of mean things to an instance that hasn't been intentionally locked
> > >>>>>>> down. (And even then, you shouldn't rely exclusively on your app's
> > >>>>>>> authentication for providing security.)
> > >>>>>>>
> > >>>>>>> Having "internal" dashboards/data sources/executors exposed to the web is
> > >>>>>>> dangerous, since old versions can stick around for a very long time, help
> > >>>>>>> compromise unrelated deployments, and generally just create very bad
> > >>>>>>> press for the overall project if there's ever a mass compromise (see:
> > >>>>>>> Redis and MongoDB).
> > >>>>>>>
> > >>>>>>> Shipping secure defaults is hard, but perhaps we could add best practices
> > >>>>>>> like instructions for deploying a robots.txt with Airflow? Or an impact
> > >>>>>>> statement about what someone could do if they access your Airflow
> > >>>>>>> instance? I think that many people deploying Airflow for the first time
> > >>>>>>> might not realize that it can get indexed, or how much damage someone can
> > >>>>>>> cause via accessing it.