Re: PSA: Make sure your Airflow instance isn't public and isn't Google indexed


I think that a banner notification would be a fair penalty if you access
Airflow without authentication, or have API authentication turned off, or
are accessing via http:// with a non-localhost `Host:`. (Are there any
other circumstances to think of?)
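
To make the proposed conditions concrete, here is a rough sketch of the check that could decide whether to show the banner. It is not Airflow code: `RequestInfo`, `hostname_of`, `security_warnings`, and the two `*_auth_enabled` flags are hypothetical names invented for illustration, and how a real webserver would learn those values is left open.

```python
from dataclasses import dataclass

# Hostnames treated as "local"; anything else over plain http triggers a warning.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


@dataclass
class RequestInfo:
    """Hypothetical summary of one request plus the instance's auth settings."""
    scheme: str             # "http" or "https"
    host_header: str        # raw Host: header, e.g. "airflow.example.com:8080"
    web_auth_enabled: bool  # assumed flag: is UI authentication configured?
    api_auth_enabled: bool  # assumed flag: is API authentication configured?


def hostname_of(host_header):
    """Strip an optional port from a Host: header, handling bracketed IPv6."""
    host = host_header.strip().lower()
    if host.startswith("[") and "]" in host:
        return host[1:host.index("]")]  # "[::1]:8080" -> "::1"
    return host.rsplit(":", 1)[0] if ":" in host else host


def security_warnings(req):
    """Return one message per insecure condition; an empty list means no banner."""
    warnings = []
    if not req.web_auth_enabled:
        warnings.append("Web authentication is off.")
    if not req.api_auth_enabled:
        warnings.append("API authentication is off.")
    if req.scheme == "http" and hostname_of(req.host_header) not in LOCAL_HOSTS:
        warnings.append("Serving plain http:// to a non-localhost Host.")
    return warnings
```

A local dev setup such as `RequestInfo("http", "localhost:8080", True, True)` yields no warnings, so the banner would only add friction where one of the three conditions actually holds.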

I would also suggest serving a default robots.txt to mitigate accidental
indexing of public instances (as most public instances will be accidentally
public, statistically speaking). If you truly want your Airflow instance
public and indexed, you should have to go out of your way to permit that.
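
As a minimal sketch of the default-deny idea, assuming the web UI is served as a WSGI application: the wrapper below answers `/robots.txt` itself unless indexing is explicitly allowed. `deny_indexing` and its `allow_indexing` switch are hypothetical names, not existing Airflow settings.

```python
# Body that asks all crawlers to stay away from every path.
ROBOTS_TXT = b"User-agent: *\nDisallow: /\n"


def deny_indexing(app, allow_indexing=False):
    """Wrap a WSGI app so /robots.txt disallows crawlers unless opted out."""

    def wrapped(environ, start_response):
        if not allow_indexing and environ.get("PATH_INFO") == "/robots.txt":
            start_response("200 OK", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(ROBOTS_TXT))),
            ])
            return [ROBOTS_TXT]
        # Every other request (or an opted-out instance) goes to the real app.
        return app(environ, start_response)

    return wrapped
```

This only asks well-behaved crawlers not to index the instance; it is not access control, so it complements authentication rather than replacing it.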

On Tue, Jun 5, 2018 at 1:51 PM, Maxime Beauchemin <
maximebeauchemin@xxxxxxxxx> wrote:

> What about a clear alert on the UI showing when auth is off? Perhaps a
> large red triangle-exclamation icon on the navbar with a tooltip
> "Authentication is off, this Airflow instance is not secure." and clicking
> it takes you to the docs' security page.
>
> Well and then of course people should make sure their infra isn't open to
> the Internet. We really shouldn't have to tell people to keep their
> infrastructure behind a firewall. In most environments you have to do quite
> a bit of work to open any resource up to the Internet (SSL certs, special
> security groups for load balancers/proxies, ...). Now I'm curious to
> understand how UMG managed to do this by mistake...
>
> Also a quick reminder to use the Connection abstraction to store secrets,
> ideally using the environment variable feature.
>
> Max
>
> On Tue, Jun 5, 2018 at 10:02 AM Taylor Edmiston <tedmiston@xxxxxxxxx>
> wrote:
>
> > One of our engineers wrote a blog post about the UMG mistakes as well.
> >
> > https://www.astronomer.io/blog/universal-music-group-airflow-leak/
> >
> > I know that best practices are well known here, but I second James'
> > suggestion that we add some docs, code, or config so that the framework
> > optimizes for being (nearly) production-ready by default and not just easy
> > to start with for local dev.  Admittedly this takes some work to not add
> > friction to the local onboarding experience.
> >
> > Do most people keep separate airflow.cfg files per environment like what's
> > considered the best practice in the Django world?  e.g.
> > https://stackoverflow.com/q/10664244/149428
> >
> > Taylor
> >
> > *Taylor Edmiston*
> > Blog <https://blog.tedmiston.com/> | CV
> > <https://stackoverflow.com/cv/taylor> | LinkedIn
> > <https://www.linkedin.com/in/tedmiston/> | AngelList
> > <https://angel.co/taylor> | Stack Overflow
> > <https://stackoverflow.com/users/149428/taylor-edmiston>
> >
> >
> > On Tue, Jun 5, 2018 at 9:57 AM, James Meickle <jmeickle@xxxxxxxxxxxxxx>
> > wrote:
> >
> > > Bumping this one because now Airflow is in the news over it...
> > >
> > > https://www.bleepingcomputer.com/news/security/contractor-exposes-credentials-for-universal-music-groups-it-infrastructure/?utm_campaign=Security%2BNewsletter&utm_medium=email&utm_source=Security_Newsletter_co_79
> > >
> > > On Fri, Mar 23, 2018 at 9:33 AM, James Meickle <jmeickle@xxxxxxxxxxxxxx>
> > > wrote:
> > >
> > > > While Googling something Airflow-related a few weeks ago, I noticed that
> > > > someone's Airflow dashboard had been indexed by Google and was accessible
> > > > to the outside world without authentication. A little more Googling
> > > > revealed a handful of other indexed instances in various states of
> > > > security. I did my best to contact the operators, and waited for responses
> > > > before posting this.
> > > >
> > > > Airflow is not a secure project by default
> > > > (https://issues.apache.org/jira/browse/AIRFLOW-2047), and you can do all
> > > > sorts of mean things to an instance that hasn't been intentionally locked
> > > > down. (And even then, you shouldn't rely exclusively on your app's
> > > > authentication for providing security.)
> > > >
> > > > Having "internal" dashboards/data sources/executors exposed to the web is
> > > > dangerous, since old versions can stick around for a very long time, help
> > > > compromise unrelated deployments, and generally just create very bad press
> > > > for the overall project if there's ever a mass compromise (see: Redis and
> > > > MongoDB).
> > > >
> > > > Shipping secure defaults is hard, but perhaps we could add best practices
> > > > like instructions for deploying a robots.txt with Airflow? Or an impact
> > > > statement about what someone could do if they access your Airflow instance?
> > > > I think that many people deploying Airflow for the first time might not
> > > > realize that it can get indexed, or how much damage someone can cause via
> > > > accessing it.
> > > >
> > >
> >
>