Scanning the Alexa top 1M sites for Dockerfiles


Recently I stumbled over a site which publicly served their Dockerfile. That particular instance wasn’t very interesting. But I started to wonder how widespread this is and what sites are exposing due to that.

By all means, this isn’t exactly new. You can find /Dockerfile in the SecLists repository for a while.
However, it seems that so far nobody (publicly) investigated this. I’m also still operating a bunch of sites that are in the top 1 million list and I couldn’t find a single request for this file in my (limited) log files.

So I’ve started to do my own scan of the Alexa top 1 Million sites list.
This work was heavily inspired by the research of Hanno Böck in the past and in particular I used his wonderful tool snallygaster to conduct most of the scans. Thanks Hanno!

What is a Dockerfile?

A Dockerfile is the blueprint of a container. It contains all commands needed to build it. It is a simple plaintext file. You can tell Docker to copy files into the container, expose network ports and of course run any command during the build, for example:

FROM nginx

COPY default.conf /etc/nginx/conf.d/default.conf

COPY html/ /usr/share/nginx/html

RUN echo " mysql" >> /etc/hosts


Basically you describe exactly how the container is configured, which packages are installed and what commands are being ran in the process of building it.

As you can see it doesn’t necessarily contain sensitive information. In the above example we don’t even see which files are copied to the NGINX document root.


Out of the 1’000’000 sites 659 served a Dockerfile.
There is large reuse of existing Dockerfiles, one in particular was used 105 times.
Overall this boils down to 338 unique Dockerfiles being served.

41 were used two times or more, in detail:

The remaining 298 were uniquely used by only one site.

Most of them did fairly innocent operations that didn’t tell us much such as:

Not much there that we couldn’t also figure out by looking at the site directly.

A lot of them gave us a very detailed view of what is probably running on the server, e.g.:

It’s nice to know exactly which PHP modules are used on the server, this might be useful in some cases.

But as I dug deeper I found sometimes not only the Dockerfile was exposed but also much of the referenced configuration files. For example in the Dockerfile “docker/nginx.conf” is copied:

Which we then can simply try to access like this:

Somewhat common in that scenario are TLS certificates and, well, keys. I’ve found around 10 of those, for example:

And some people simply do insane things in their Dockerfile, like exposing their AWS secret key:

Or using a tool called “sshpass” to pipe a password into ssh to automate a rsync:

And at least one SSH root key is being served:

Overall I found SSH keys, npm tokens, TLS keys, passwords, AWS secrets, Amazon SES credentials, countless configuration files and source code of some of the applications.

These are of course the extreme examples which are to be expected on such a wide range scan.

How does this happen?

By default the Dockerfile is not copied into the container and certainly not to a publicly served folder.

From what I can tell the mistake that most of these sites make is practically this (real example from the scan):

With the first COPY line they copy everything in the current folder to a publicly served folder.
Afterwards configuration files get copied.

With this both the nginx.conf and the complete ssl directory are public. We can now simply fetch the nginx.conf, lookup the name of the certificate and key files and then fetch those as well.

In some cases there was no such COPY command. I can only guess that the files ended up due to another mistake in the document root, possibly unrelated to Docker.


With only 0.066 % of sites exposing a Dockerfile this doesn’t look like a very widespread problem. And on top of that only a subset of those – less than 100 – expose really critical information that can lead to a compromise.

But in any case, it rarely makes sense to publicly serve a Dockerfile.
Even if you don’t include any keys, passwords or other secrets: It still doesn’t make sense to give everyone a blueprint of your system.
The sites that don’t expose anything critical right now might start in the future when changes are made to this seemingly private file.

It’s generally good advice – even if you don’t use Docker – to simply check your public webroot folder for any files that shouldn’t be there and remove them.


XSS on

I found a vulnerability on which allowed me to store XSS on their module page for a module I own.
User interaction was still required to execute the JavaScript payload by hovering over a link on the page, thus the risk was rather limited.

The issue was that not all values in metadata.json of uploaded modules were correctly sanitized. You could upload a module with the following metadata.json payload (abbreviated):

  "operatingsystem_support": [
      "operatingsystemrelease":[ "5", "6", "7<script>alert('xss')</script>" ]

When a user then visited the module page and hovered over the “CentOS” link, to figure out which versions are supported, then the JavaScript payload would be executed:

This issue has been fixed by the Puppet team.

2018-03-24 – Issue was reported to the Puppet security team.
2018-04-01 – Asking for feedback if the report has been received.
2018-04-01 – Puppet security team confirms and says it’s added to their backlog.
2018-06-13 – Asking for feedback if the issue is resolved.
2018-06-13 – Puppet security team confirms it’s fixed, possibly already since March.


iOS camera QR code URL parser bug

I’ve learned recently that the iOS 11 camera app will now automatically scan QR codes and interpret them.
This is pretty cool, until now you needed special apps to do that for you on iOS.
When scanning a QR code which contains a URL – in this case –  iOS will show a notification like this:

Naturally the first thing I want to try is to construct a QR code which will show an unsuspicious hostname in the notification but then open another URL in Safari.

And this is exactly what I found after a few minutes. Here it is in action:

There is no redirect misuse being done on, Safari will only access


If you scan this QR code with the iOS (11.2.1) camera app:

The URL embedded in the QR code is:

It will show this notification:

But if you tap it to open the site, it will instead open

The URL parser of the camera app has a problem here detecting the hostname in this URL in the same way as Safari does.
It probably detects “xxx\” as the username to be sent to “”.
While Safari might take the complete string “xxx\” as a username and “443” as the password to be sent to
This leads to a different hostname being displayed in the notification compared to what actually is opened in Safari.

This issue has been reported to the Apple security team on 2017-12-23.
As of today (2018-03-24) this is still not fixed.

On 2018-04-24 this has been fixed with iOS 11.3.1 and macOS 10.13.4.
CVE-2018-4187 has been assigned to both issues.


Stored XSS in Foreman

Following up a bit on my recent post “Looking at public Puppet servers” I was wondering how an attacker could extend his rights within the Puppet ecosystem especially when a system like Foreman is used. Cross site scripting could be useful for this, gaining access to Foreman would allow an attacker basically to compromise everything.

I’ve focused first on facts. Facts are generated by the local system and can be overwritten given enough permissions. Displaying facts in the table seemed to be secured sufficiently, however there is another function on the /fact_values page: Showing an distribution graph of a specific fact.

When the graph is displayed HTML tags are not removed from facts and XSS is possible. Both in the fact name (as a header in the chart) and fact value (in the legend of the chart).

For example, add two new facts by running:

mkdir -p /etc/facter/facts.d/
cat << EOF >> /etc/facter/facts.d/xss.yaml
aaa_test_fact<script>alert(1)</script>: xxx
aab_test_fact: x<script>alert(1)</script>xx

It will shop up like this in the global /fact_values page:

Clicking on the “Show distribution chart” action on either of those facts will execute the provided alert(1) JavaScript:

That’s fun but not really useful, tricking someone to click on the distribution chart of such a weird fact seems impractical.

But since the XSS is in the value of the fact we can just overwrite more interesting facts on that node and hope that an Administrator wants to see the distribution of that fact. For example, let’s add this to xss.yaml:

kernelversion: x><script>alert(1)</script>xx

Now if an Administrator wants to know the distribution of kernel versions in his environment and he uses this chart feature on any host the alert(1) JavaScript will get executed. This is what any other node will look like:

And after navigating to the kernelversion distribution chart on that page:

Still some interaction needed. I’ve noticed that on the general /statistics page the same graphs are used and facts like “manufacturer” are used in them. Unlike the other graphs these do not have a legend. But when you hover over a portion of the graph you’ll get a tooltip with the fact value. This is again vulnerable to XSS. For example add to xss.yaml:

manufacturer: x<img src='/' onerror='alert(1)'>x

Now when you visit the /statistics page and move the mouse over the hardware graph, the alert(1) will execute:

Still needs interaction. But if you inject a value into all the graphs it may not take long for an Administrator to hover over one of those.

However: By default Foreman uses CSP. Stealing someones session with this setup is not easily possible. So my initial plan to steal an Administrators Foreman session failed in the end.

This was tested on Foreman 1.15.6 and reported to the Foreman security team on 2017-10-31.
CVE-2017-15100 has been assigned to this issue.
A fix is already implemented and will be released with version 1.16.0.


Looking at public Puppet servers

This is about research I did a while ago but only now had time to finally write about.

In the beginning of this year I was curious to see how many Puppet 3 servers – back then freshly end of life – were connected directly to the internet:

If you don’t know Puppet: It’s a configuration management system that contains information to deploy systems and services in your infrastructure. You write code which defines how a system should be configured e.g. which software to install, which users to deploy, how a service is configured, etc.
It typically uses a client-server model, the clients periodically pull the configuration (“catalog”) from their configured server and apply it locally. Everything is transferred over TLS encrypted connections.

Puppet uses TLS client certificates to authenticate nodes. When a client (“puppet agent”) connects for the first time to a server it will generate a key locally and submit a certificate signing request to the server.
A operator needs to sign the certificate and from that point on the agent can pull its configuration from the server.

However it’s possible to configure Puppet server to simply automatically sign all incoming CSRs.
This is obviously not recommended to do unless you want anyone to get possibly sensitive information about your infrastructure. The Puppet documentation mentions several times that this is insecure:

First I was interested if anyone is already looking for servers configured like this.
I’ve setup a honeypot Puppet server with auto-signing enabled and waited.
But after months there still was not a single CSR submitted, only port scanners tried to connect to it.

I’ve decided to look for those servers myself.
With a small script I looped over around 2500 still online servers – there were still more but due to time constraints I only checked 2500. I’ve built a system which connects to all of those systems and submits a CSR. The “certname” – this is what operators will see when reviewing CSRs – was always the same and it was pretty obvious that it is not a legitimate request.
An attacker would do more recon, get the FQDNs of the Puppet server from its certificate and try to guess a more likely name.

Out of those 2500 servers:
89 immediately signed our certificate.

Out of those 89:
50 compiled a valid catalog that could have been applied directly.
39 tried to compile a catalog and failed with issues that could potentially be worked around on the client but no time was spent on that.

It is a normal setup to have a default role. If a unknown node connects it may get only this configuration applied. Usually this deploys the Administrator user accounts and sets generic settings. This happened a lot and that is already problematic since the system accounts also get their password hash configured which now could be brute-forced. And a lot of them also conveniently send along their sudoers configuration. An attacker could target higher privileged accounts directly.
But some of those servers assigned more generic roles automatically. Exposing root SSH keys, AWS credentials, configuration files of their services (Nginx, PostgreSQL, MySQL, …), tried to copy source code to it, passwords / password hashes of database and other service users:

And more. There are systems that could be immediately compromised with the information that they leak. The others at least tell attackers a lot about the system which makes further attacks much easier.

Here is where it gets interesting: One day later I connected again to the same 2500 servers using the same keys from the day before trying to retrieve a catalog. Normally we’d expect the number now to be stable or slightly less. But in this case it’s not:

145 servers allowed us to connect.
58 gave us a valid catalog that could be applied.

And one week later:
159 servers allowed us to connect.
63 gave us a valid catalog that could be applied.

Those servers were not offline during the first round either. They simply signed my CSR in the meantime without noticing that it’s none of their requests or I suspect that this is the combination of the following two items:

1) Generally when you are working with Puppet certificates you’ll be using “puppet cert $sub-command” to handle the TLS operations for you.

The problem is, that there is no way to tell it to simply reject a signing request.
You can “revoke” or “clean” certificates but it has to be signed first (see:
There may have been an option using the “puppet ca” command, but it has been deprecated and pretty much every documentation only mentions “puppet cert” nowadays.

You are left with these options:

  • Manually figure out where the .csr file is stored on the server and remove it
  • Use the deprecated tool “puppet ca destroy $certname”
  • Run “puppet cert sign $certname && puppet cert clean $certname”
  • Run “puppet cert sign $certname && puppet cert revoke $certname”

Some are unfortunately using the last two options.

There is the possibility that on some of those systems my certificate request was signed if only for a few seconds. Which brings us to the next issue:

2) Puppet server or in some cases the reverse proxy in front of it will only read the certificate revocation list at startup time.
If you don’t restart these services after revoking a certificate, it will still be allowed to connect.

“puppet cert revoke $certname” basically only adds the certificate to the CRL. It does not remove the signed certificate from the server.
I suspect that some operators have signed and revoked my certificate but haven’t restarted the service afterwards.

On the other hand “puppet cert clean $certname” will additionally remove the signed certificate from the server, when my client connects later it cannot get the signed certificate and is locked out.
This isn’t perfect either. If the client constantly requests the certificate it could retrieve it before the “clean” ran, but it is far better than only using “revoke”.

Depending on how you use Puppet, it may be one of your most critical systems containing the keys to your kingdom. There are very few cases where it makes sense to expose a Puppet server directly to the internet.
Those systems should be the best protected systems in your infrastructure, a compromised Puppet server essentially compromises all clients that are connecting to it.

Basic or naïve auto-signing should not be used in a production environment unless you can ensure that only trusted clients can reach the Puppet server. Only policy-based auto-signing or manual signing is secure otherwise.

XSS in ownCloud 2

A few weeks ago ownCloud 4.0.0 was released and it included some cool features like encryption of uploaded files. I decided to take it for a spin again.

I found again some XSS vulnerabilities. As last time, I reported these issues to the ownCloud team which responded quickly and fixed them already (with version 4.0.2). As far as I can tell CVE-2012-4396 was assigned to these issues (and others were merged into it as well).

1) change ID3 title tag of a MP3 file to: “Kalimba<script>alert(1)</script>”
2) upload
3) play it in the integrated player, JS gets executed

Now this is fun! Imagine someone sending you a MP3 which you listen to with ownCloud and in the background your cookies are sent to a remote system. If you run a ownCloud instance with multiple users, you can also share those files. It might be enough to listen to a shared MP3 to get your account compromised, I didn’t verify this though.

1) upload picture e.g. trollface.jpg
2) rename picture to “trollf<style onload=alert(1)>ace.jpg”
3) view the picture, JS gets executed

Can’t think of a good scenario for this to be useful. Maybe sharing this file.

1) add new appointment, title: “XSS <script>alert(1);</script>”
2) switch calendar view to “list”, JS gets executed

This was a bit surprising as the normal calendar view was not affected, only the list view.

XSS in ownCloud

A few weeks ago there was a bit of a hype about ownCloud when they released version 3.0.1. I decided to give it a spin, here is what I found.

Note: I contacted the development team earlier and these vulnerabilities have been fixed in the meantime with version 3.0.2, although I have not confirmed this myself due to lack of time.

XSS in files/download.php

The attacker can send an URL to the victim and JavaScript will be executed in the victims session. The attacker does not need an account on the ownCloud instance, only knowledge about the URL path:


XSS in files/index.php

If you share your ownCloud instance with multiple users, the attacker can send an URL to the victim and JavaScript will be executed in the victims session. Both the attacker and victim need accounts on the same instance.

Here is how:

1) Create a new folder on http://localhost/owncloud/files/index.php – any name will do, I used “PoC”

2) Share this folder with your victim or the victims group

3) Switch to http://localhost/owncloud/files/index.php?dir=/PoC

4) Create a folder, called:

x"> <body onload=alert(1)><x="

5) Send that link to your victim:


6) ???

7) Profit!

It may be possible to create the folder directly in /, however I couldn’t get that folder shared with other users. But since it gets automatically shared if the parent folder is shared, I didn’t invest much time into that.

XSS in apps/contacts/index.php

I found another XSS flaw in the Contacts function, creating a contact and adding this in any field:


will also execute. However, since you cannot share contacts between users (or can you?) I believe this is a minor problem.

XSS in Xymon

Last week I had some time to play with the monitoring tool Xymon. Xymon is a monitoring and alerting software mostly written in C. Its server component provides you with a web-interface to check the health of your systems. I only quickly investigated this web-interface.

After walking through the various kinds of pages I found that almost all parameters are correctly checked or sanitized. But on one page the code was doing something odd (criticalview.c):

fprintf(output, "<a href="\&quot;%s&amp;NKPRIO=%d&amp;NKTTGROUP=%s&amp;NKTTEXTRA=%s\&quot;">",

hostsvcurl(itm->;hostname, colname, 1),


htmlgroupstr, htmlextrastr);

This is strange, the htmlgroupstr and htmlextrastr variables do not get used anywhere else on criticalview.c. The link it generates points to svcstatus.c (well its wrapper, On that page the NKTTGROUP and NKTTEXTRA parameters simple get displayed on the page, without any further cleanup. With that we can generate links like this:


This nicely executes our injected JavaScript. Bug was reported at SourceForge.