Scanning the Alexa top 1M sites for Dockerfiles

Intro

Recently I stumbled over a site which publicly served its Dockerfile. That particular instance wasn’t very interesting. But I started to wonder how widespread this is and what sites are exposing as a result.

To be clear, this isn’t exactly new: /Dockerfile has been in the SecLists repository for a while.
However, it seems that so far nobody has (publicly) investigated this. I’m also still operating a bunch of sites that are in the top 1 million list and I couldn’t find a single request for this file in my (limited) log files.

So I started my own scan of the Alexa top 1 million sites list.
This work was heavily inspired by Hanno Böck’s past research, and in particular I used his wonderful tool snallygaster to conduct most of the scans. Thanks Hanno!

What is a Dockerfile?

A Dockerfile is the blueprint of a container. It contains all commands needed to build it. It is a simple plaintext file. You can tell Docker to copy files into the container, expose network ports and of course run any command during the build, for example:


FROM nginx

COPY default.conf /etc/nginx/conf.d/default.conf

COPY html/ /usr/share/nginx/html

RUN echo "192.168.1.14 mysql" >> /etc/hosts

EXPOSE 80

Basically you describe exactly how the container is configured, which packages are installed and which commands are run in the process of building it.

As you can see it doesn’t necessarily contain sensitive information. In the above example we don’t even see which files are copied to the NGINX document root.
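To tell a real Dockerfile apart from a custom 404 page, a simple heuristic on the response body goes a long way. The following is only a minimal sketch of such a check. It is not how snallygaster does it, and the function name and the instruction list are my own assumptions:

```python
import re

# Common Dockerfile instructions (assumption: not an exhaustive list).
INSTRUCTION = re.compile(
    r"^\s*(FROM|RUN|COPY|ADD|EXPOSE|ENV|CMD|ENTRYPOINT|WORKDIR|USER|VOLUME|ARG|LABEL)\s+\S",
    re.IGNORECASE,
)


def looks_like_dockerfile(body: str) -> bool:
    """Heuristic: no HTML, a FROM line and at least one more instruction."""
    if "<html" in body.lower():
        return False  # most likely an error page served with status 200
    lines = [line for line in body.splitlines() if INSTRUCTION.match(line)]
    has_from = any(line.split()[0].upper() == "FROM" for line in lines)
    return has_from and len(lines) >= 2
```

Requiring a FROM line plus a second instruction keeps plain text files that merely contain the word "from" out of the results.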

Results

Out of the 1’000’000 sites 659 served a Dockerfile.
There is large reuse of existing Dockerfiles, one in particular was used 105 times.
Overall this boils down to 338 unique Dockerfiles being served.
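Counting unique files is easiest by hashing each response body. A minimal sketch, assuming the scan results are available as a mapping from hostname to Dockerfile content (that layout and the function name are assumptions of mine):

```python
import hashlib
from collections import Counter


def summarize(dockerfiles: dict[str, str]) -> tuple[int, int, int]:
    """Return (sites serving a file, unique files, files seen on 2+ sites)."""
    digests = Counter(
        hashlib.sha256(body.encode("utf-8")).hexdigest()
        for body in dockerfiles.values()
    )
    reused = sum(1 for count in digests.values() if count >= 2)
    return len(dockerfiles), len(digests), reused
```

Byte-identical files get the same digest, so whitespace-only differences still count as separate files in this sketch.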

41 were used two times or more, in detail:

The remaining 298 were uniquely used by only one site.

Most of them did fairly innocent operations that didn’t tell us much such as:

Not much there that we couldn’t also figure out by looking at the site directly.

A lot of them gave us a very detailed view of what is probably running on the server, e.g.:

It’s nice to know exactly which PHP modules are used on the server; this might be useful in some cases.

But as I dug deeper I found that sometimes not only the Dockerfile was exposed but also many of the referenced configuration files. For example, in this Dockerfile “docker/nginx.conf” is copied:

Which we then can simply try to access like this:

Somewhat common in that scenario are TLS certificates and, well, keys. I’ve found around 10 of those, for example:

And some people simply do insane things in their Dockerfile, like exposing their AWS secret key:

Or using a tool called “sshpass” to pipe a password into ssh to automate a rsync:

And at least one SSH root key is being served:

Overall I found SSH keys, npm tokens, TLS keys, passwords, AWS secrets, Amazon SES credentials, countless configuration files and source code of some of the applications.

These are of course the extreme examples, which are to be expected in such a wide-ranging scan.

How does this happen?

By default the Dockerfile is not copied into the container and certainly not to a publicly served folder.

From what I can tell the mistake that most of these sites make is practically this (real example from the scan):

With the first COPY line they copy everything in the current folder to a publicly served folder.
Afterwards configuration files get copied.

With this, both the nginx.conf and the complete ssl directory are public. We can now simply fetch the nginx.conf, look up the names of the certificate and key files and then fetch those as well.
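The same reasoning can be automated: if a Dockerfile copies the whole build context into the webroot, every other COPY source is a candidate path to request. A rough sketch follows. The sample Dockerfile is a hypothetical reconstruction and not the file from the scan, and the parser ignores the JSON array form of COPY:

```python
# Hypothetical Dockerfile showing the pattern described above.
SAMPLE = """\
FROM nginx
COPY . /usr/share/nginx/html
COPY nginx.conf /etc/nginx/nginx.conf
COPY ssl /etc/nginx/ssl
"""


def exposed_sources(dockerfile: str) -> list[str]:
    """List COPY/ADD sources that are also reachable via a 'COPY . <webroot>'."""
    copies = []
    for line in dockerfile.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0].upper() in ("COPY", "ADD"):
            args = [p for p in parts[1:] if not p.startswith("--")]
            copies.append(args[:-1])  # everything but the destination
    sources = [src for srcs in copies for src in srcs]
    if not any(src in (".", "./") for src in sources):
        return []  # build context is not copied wholesale
    return sorted({s.removeprefix("./") for s in sources if s not in (".", "./")})
```

For the sample this yields nginx.conf and ssl, which are exactly the paths to try next to /Dockerfile.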

In some cases there was no such COPY command. I can only guess that the files ended up in the document root due to another mistake, possibly unrelated to Docker.

Conclusion

With only about 0.066 % of sites (659 out of 1’000’000) exposing a Dockerfile this doesn’t look like a very widespread problem. And on top of that only a subset of those – fewer than 100 – expose really critical information that can lead to a compromise.

But in any case, it rarely makes sense to publicly serve a Dockerfile.
Even if you don’t include any keys, passwords or other secrets, it still doesn’t make sense to give everyone a blueprint of your system.
The sites that don’t expose anything critical right now might start to do so in the future when changes are made to this seemingly private file.

It’s generally good advice – even if you don’t use Docker – to simply check your public webroot folder for any files that shouldn’t be there and remove them.
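Such a check is easy to script. The following sketch walks a webroot and reports files that usually have no business being there. The list of names and suffixes is my own assumption and should be adapted to your setup:

```python
import os

# Assumption: a small, incomplete list of names that should not be public.
SUSPICIOUS_NAMES = {"Dockerfile", "docker-compose.yml", ".env"}
SUSPICIOUS_SUFFIXES = (".key", ".pem")


def find_suspicious(webroot: str) -> list[str]:
    """Return paths (relative to webroot) that look like they should be private."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(webroot):
        if ".git" in dirnames:
            hits.append(os.path.relpath(os.path.join(dirpath, ".git"), webroot))
            dirnames.remove(".git")  # report the repository once, do not descend
        for name in filenames:
            if name in SUSPICIOUS_NAMES or name.endswith(SUSPICIOUS_SUFFIXES):
                hits.append(os.path.relpath(os.path.join(dirpath, name), webroot))
    return sorted(hits)
```

Running it against the document root, for example find_suspicious("/usr/share/nginx/html"), prints nothing by itself; it returns the list so it can be used in a deployment check.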

 

ISITDTU CTF 2018 Friss

The ISITDTU CTF 2018 – Friss challenge presented us only with a URL without any explanation. On that URL a single form field was displayed:

That form only accepted URLs which point to localhost.
The page’s HTML source also contains a comment which gives us access to the debug version by appending ?debug=1 to the URL:

Here we find that a config.php is included. Since only the host part is checked for containing localhost, we can request local files like this: file://localhost/etc/hosts
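To illustrate why such a check is not enough, here is a small sketch in Python (the challenge itself is written in PHP, and both function names are made up for this example). It mimics a filter that only looks at the host and compares it with one that also restricts the scheme:

```python
from urllib.parse import urlsplit


def naive_is_allowed(url: str) -> bool:
    """Hypothetical filter: only requires 'localhost' in the host part."""
    host = urlsplit(url).hostname or ""
    return "localhost" in host


def stricter_is_allowed(url: str) -> bool:
    """Additionally requires an http(s) scheme and an exact host match."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname == "localhost"
```

A file:// URL has a perfectly valid host part, so the naive variant happily accepts file://localhost/etc/hosts while the stricter one rejects it.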

And we can also get the config.php file by requesting file://localhost/var/www/html/config.php:

In the config.php file we find MySQL connection details and information that the flag is probably stored in the table ssrf.flag.

We can send requests to MySQL by requesting http://127.0.0.1:3306/ but that is not very useful since we need to log in and send a query over the MySQL binary protocol.
Fortunately curl still supports the gopher protocol, with which we can send requests to MySQL without any additional headers.
Crafting the correct binary content is the hard part, but that problem is also solved already. We used this Python script to create the payload. The author – Tarunkant – explained SSRF via gopher and his script very well here.
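For the query part the binary content is simple enough to build by hand. The sketch below is not Tarunkant’s script, just my own minimal illustration: it frames a query and a quit command as MySQL packets and percent-encodes them into a gopher URL. The captured authentication packet would have to be prepended to the raw bytes, and all function names are made up:

```python
def mysql_packet(payload: bytes, sequence_id: int = 0) -> bytes:
    """Frame a payload: 3-byte little-endian length, 1-byte sequence id, payload."""
    return len(payload).to_bytes(3, "little") + bytes([sequence_id]) + payload


def com_query(sql: str) -> bytes:
    """COM_QUERY packet: command byte 0x03 followed by the SQL text."""
    return mysql_packet(b"\x03" + sql.encode("ascii"))


def com_quit() -> bytes:
    """COM_QUIT packet: command byte 0x01, closes the connection."""
    return mysql_packet(b"\x01")


def gopher_url(host: str, port: int, raw: bytes) -> str:
    """Percent-encode every byte; the leading '_' is the gopher item type."""
    encoded = "".join(f"%{byte:02x}" for byte in raw)
    return f"gopher://{host}:{port}/_{encoded}"
```

Encoding every byte, even printable ones, avoids any surprises with characters that have a special meaning in URLs.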

But this script still requires the raw authentication packet. We started a MySQL server and then connected to it via mysql -h 127.0.0.1 -u ssrf_user (the login does not need to succeed).
Sniffing the traffic with Wireshark we get the following authentication packet (follow the TCP stream and filter only for client-to-server traffic):

Switch to raw representation:

Run the above Python script to generate the payload; as the query we entered SELECT * FROM ssrf.flag;:

That produces the gopher URL:
gopher://127.0.0.1:3306/_%b3%00%00%01%85%a6%3f%20%00%00%00%01
%2d%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00
%00%00%00%00%73%73%72%66%5f%75%73%65%72%00%00%6d%79%73%71%6c
%5f%6e%61%74%69%76%65%5f%70%61%73%73%77%6f%72%64%00%71%03%5f
%6f%73%10%64%65%62%69%61%6e%2d%6c%69%6e%75%78%2d%67%6e%75%0c
%5f%63%6c%69%65%6e%74%5f%6e%61%6d%65%08%6c%69%62%6d%79%73%71
%6c%04%5f%70%69%64%04%31%36%36%33%0f%5f%63%6c%69%65%6e%74%5f
%76%65%72%73%69%6f%6e%07%31%30%2e%31%2e%32%39%09%5f%70%6c%61
%74%66%6f%72%6d%06%78%38%36%5f%36%34%0c%70%72%6f%67%72%61%6d
%5f%6e%61%6d%65%05%6d%79%73%71%6c%19%00%00%00%03%53%45%4c%45
%43%54%20%2a%20%46%52%4f%4d%20%73%73%72%66%2e%66%6c%61%67%3b
%01%00%00%00%01

And when we request that, we get the flag:

The flag is: ISITDTU{JUST_4_SSrF_B4B3!!}

 

Google CTF 2018 shall we play a game

Although we haven’t managed to submit the flag correctly, I’m still publishing this write-up. Maybe it helps someone.

The Google CTF 2018 “shall we play a game?” challenge:

This was in the reverse engineering category and only included a link to an apk file (mirror) and the short description that we need to win the game 1’000’000 times to get the flag.

I already had the setup to run and investigate Android apps from an excellent BSides Munich workshop, Fun with Frida. In the end I didn’t use Frida to solve the challenge, but the setup alone helped a lot already.

Looking at the game, it is a very simple tic-tac-toe game with a win counter that goes up to 1’000’000:

Starting to reverse it, I used unpack-apk.sh to extract the apk file and attempt to decompile it as well. Looking at the source code at app/extracted/src/main/java/com/google/ctf/shallweplayagame/GameActivity.java we find that this is the main program. Two functions there are of interest to us (comments by me):

    // Display flag and some magic we don't understand
    void m() {
        Object _ = N._(Integer.valueOf(0), N.a, Integer.valueOf(0));
        Object _2 = N._(Integer.valueOf(1), N.b, this.q, Integer.valueOf(1));
        N._(Integer.valueOf(0), N.c, _, Integer.valueOf(2), _2);
        ((TextView) findViewById(R.id.score)).setText(new String((byte[]) N._(Integer.valueOf(0), N.d, _, this.r)));
        o();
    }

    void n() {
        // Reset the board, remove X and O from the board
        for (int i = 0; i < 3; i++) {
            for (int i2 = 0; i2 < 3; i2++) {
                this.l[i2][i].a(a.EMPTY, 25);
            }
        }
        k();
        // Increase win counter
        this.o++;
        // Some magic we don't understand
        Object _ = N._(Integer.valueOf(2), N.e, Integer.valueOf(2));
        N._(Integer.valueOf(2), N.f, _, this.q);
        this.q = (byte[]) N._(Integer.valueOf(2), N.g, _);
        // Check if win counter is 1'000'000
        if (this.o == 1000000) {
            // Show the flag
            m();
            return;
        }
        ((TextView) findViewById(R.id.score)).setText(String.format("%d / %d", new Object[]{Integer.valueOf(this.o), Integer.valueOf(1000000)}));
    }

My first attempts were to start the game with a win counter of already 999’999 or to decrease the 1’000’000 to 2. But neither worked: we won the game but instead of the flag we’d get binary garbage displayed. It’s clear that the magic we don’t understand needs to run 1’000’000 times to produce the correct string (line 21 – 23).
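Why changing the counter alone cannot work is easiest to see with a toy model. I don’t know what the N._ calls actually do, so the following is purely an analogy: it assumes the hidden state this.q is replaced on every win by some one-way function of itself, with SHA-256 as a stand-in:

```python
import hashlib


def final_state(seed: bytes, rounds: int) -> bytes:
    """Toy model: the state is re-hashed once per win (SHA-256 is an assumption)."""
    state = seed
    for _ in range(rounds):
        state = hashlib.sha256(state).digest()
    return state
```

With such an update the state after 2 rounds has nothing in common with the state after 1’000’000 rounds, so anything derived from it (like a decryption key) only comes out right if every single round was actually executed.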

I started to look at the assembly file app/extracted/smali/com/google/ctf/shallweplayagame/GameActivity.smali and the method n() as that’s where we need to make changes:


.method n()V
    .locals 10

    const v9, 0xf4240

This must be the right place, 0xf4240 is 1’000’000 in hex. We can fairly easily find the increase of the win counter:


    iget v0, p0, Lcom/google/ctf/shallweplayagame/GameActivity;->o:I

    add-int/lit8 v0, v0, 0x1

    iput v0, p0, Lcom/google/ctf/shallweplayagame/GameActivity;->o:I

And further down is the win check:


    iget v0, p0, Lcom/google/ctf/shallweplayagame/GameActivity;->o:I

    if-ne v0, v9, :cond_2

    invoke-virtual {p0}, Lcom/google/ctf/shallweplayagame/GameActivity;->m()V

Between those sections, we need to add a new loop which runs 1’000’000 times. I did that with the following patch:


--- orig/GameActivity.smali	2018-06-26 10:47:29.072510136 +0200
+++ patched/GameActivity.smali	2018-06-26 11:21:00.495157390 +0200
@@ -664,7 +664,8 @@
 .end method
 
 .method n()V
-    .locals 10
+    # Increase local variable count by one
+    .locals 11
 
     const v9, 0xf4240
 
@@ -676,6 +677,9 @@
 
     const/4 v6, 0x2
 
+    # Add new variable v10 with value 0
+    const v10, 0x0
+
     move v2, v1
 
     :goto_0
@@ -716,8 +720,20 @@
 
     add-int/lit8 v0, v0, 0x1
 
+    # move 1'000'000 into the win counter, we now only need to win once
+    move v0, v9
+
     iput v0, p0, Lcom/google/ctf/shallweplayagame/GameActivity;->o:I
 
+    # add new loop, goto_3 label
+    :goto_3
+
+    # break out condition, if v10 is 1'000'000 goto cond_3
+    if-ge v10, v9, :cond_3
+
+    # increase loop counter v10 by 1
+    add-int/lit8 v10, v10, 0x1
+
     new-array v0, v7, [Ljava/lang/Object;
 
     invoke-static {v6}, Ljava/lang/Integer;->valueOf(I)Ljava/lang/Integer;
@@ -786,6 +802,12 @@
 
     iput-object v0, p0, Lcom/google/ctf/shallweplayagame/GameActivity;->q:[B
 
+    # end of the for loop, jump back up to goto_3 label until v10 is 1'000'000
+    goto :goto_3
+
+    # break out label, jump here if v10 is 1'000'000
+    :cond_3
+
     iget v0, p0, Lcom/google/ctf/shallweplayagame/GameActivity;->o:I
 
     if-ne v0, v9, :cond_2

Now we just need to build a new apk file, zipalign it, sign it, install it on our emulator and run it:


# apktool b
# zipalign -v 4 dist/app.apk app.aligned.apk
# jarsigner -verbose -storepass android -keystore ~/.android/debug.keystore app.aligned.apk signkey
# adb install app.aligned.apk

And when we finally run it, this screen is displayed:

The flag is CTF{ThLssOfInncncIsThPrcOfAppls} or CTF{ThLssOfInncncIsThPrcOfAppIs} or CTF{ThLssOflnncncIsThPrcOfAppls} – I’m still not sure.

XSS on forge.puppet.com

I found a vulnerability on forge.puppet.com which allowed me to store XSS on their module page for a module I own.
User interaction was still required to execute the JavaScript payload by hovering over a link on the page, thus the risk was rather limited.

The issue was that not all values in metadata.json of uploaded modules were correctly sanitized. You could upload a module with the following metadata.json payload (abbreviated):

  "operatingsystem_support": [
    {
      "operatingsystem":"CentOS",
      "operatingsystemrelease":[ "5", "6", "7<script>alert('xss')</script>" ]
    }
  ],

When a user then visited the module page and hovered over the “CentOS” link to see which versions are supported, the JavaScript payload would be executed:

This issue has been fixed by the Puppet team.
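I don’t know how Puppet fixed it, but the general remedy for this class of bug is to escape every user-supplied value when it is rendered into HTML. A minimal sketch in Python, where render_release_list is a made-up helper and not part of the Forge:

```python
import html


def render_release_list(releases: list[str]) -> str:
    """Render release strings from metadata.json as an HTML list, escaped."""
    items = "".join(f"<li>{html.escape(release)}</li>" for release in releases)
    return f"<ul>{items}</ul>"
```

With the payload from above the script tag ends up as harmless text (&lt;script&gt;...) instead of being interpreted by the browser.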

Timeline:
2018-03-24 – Issue was reported to the Puppet security team.
2018-04-01 – Asking for feedback if the report has been received.
2018-04-01 – Puppet security team confirms and says it’s added to their backlog.
2018-06-13 – Asking for feedback if the issue is resolved.
2018-06-13 – Puppet security team confirms it’s fixed, possibly already since March.

 

WPICTF 2018 guess5

The WPICTF 2018 “guess5” challenge:

The URL presented us with a guessing game in which we have to pick 6 numbers. If we pick the correct numbers we’ll get a flag:

However, submitting our picks never worked. Investigating this for a bit, it looks like this requires running a local Ethereum node. Before trying to set that up we looked more into what the web application does. Interestingly it fetches the URL https://glgaines.github.io/guess5/Guess6.json (mirror here). In there we can find the ETH contract, in plain text for some reason, which contains the flag:

The flag is: WPI{All_Hail_The_Mighty_Vitalik}
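Instead of reading through the JSON by hand, a regular expression over the raw text finds anything that looks like a flag. A small sketch; the sample document below is hypothetical and much shorter than the real Guess6.json:

```python
import json
import re

# Flag format as used in this challenge: WPI{...}
FLAG_RE = re.compile(r"WPI\{[^}]*\}")


def find_flags(text: str) -> list[str]:
    """Return every substring of text that matches the flag format."""
    return FLAG_RE.findall(text)


# Hypothetical stand-in for the downloaded JSON file.
SAMPLE = json.dumps({
    "contractName": "Guess6",
    "source": 'contract Guess6 { string flag = "WPI{example_flag}"; }',
})
```

Searching the raw text rather than the parsed structure means we don’t need to know in which field the contract source is stored.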

WPICTF 2018 Shell-JAIL-2

The WPICTF 2018 “Shell-JAIL-2” challenge:

This is almost the same challenge as Shell-JAIL-1 (see my write-up here for explanation of details) with the exception of one extra line in access.c (full mirror here):

        setenv("PATH", "", 1);

This means that before passing the arguments to system() the $PATH environment variable is set to an empty string. Also the blacklist filter of the previous challenge remains the same. With that only built-in sh commands will continue to work, and since / is also blacklisted we cannot provide full paths to binaries either. For example id will now not work while pwd still executes:

But the . (or source) command still works. With that we can tell the shell to try to execute the flag.txt file and the error message will reveal its content. We still use the ? wildcard to circumvent the other blacklist by passing . "fl?g.t?t" to it:

The flag is: wpi{p0s1x_sh3Lls_ar3_w13rD}

WPICTF 2018 Shell-JAIL-1

The WPICTF 2018 “Shell-JAIL-1” challenge:

After downloading the linked private key and connecting to the remote server we are dropped into a limited user account and the directory /home/pc_owner. In that folder there are only 3 files – including flag.txt to which our user has no access:

The access file is basically a setuid executable which will run as the pc_owner user. The source of the executable is also available in access.c (mirror here). The program will take all arguments and pass them to system() unless they contain blacklisted strings; the relevant parts of the source code:

int filter(const char *cmd){
	int valid = 1;
	valid &= strstr(cmd, "*") == NULL;
	valid &= strstr(cmd, "sh") == NULL;
	valid &= strstr(cmd, "/") == NULL;
	valid &= strstr(cmd, "home") == NULL;
	valid &= strstr(cmd, "pc_owner") == NULL;
	valid &= strstr(cmd, "flag") == NULL;
	valid &= strstr(cmd, "txt") == NULL;
	return valid;
}


int main(int argc, const char **argv){
	setreuid(UID, UID);
	char *cmd = gen_cmd(argc, argv);
	if (!filter(cmd)){
		exit(-1);
	}
	system(cmd);
}

This means passing id to it will work but cat flag.txt will not:

Of course circumventing that filter is rather easy: the * wildcard is forbidden, but ? is not. We can use that wildcard to read flag.txt by passing cat "fla?.tx?" to it:

The flag is: wpi{MaNY_WayS_T0_r3Ad}
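Candidate commands can be tested offline against a re-implementation of the filter. The sketch below is a Python port of filter() from access.c. It assumes that gen_cmd() simply joins the arguments with spaces, so the string that reaches the filter no longer contains the quotes (they only stop our own login shell from expanding the wildcard too early):

```python
from fnmatch import fnmatchcase

# Same substrings as in filter() from access.c.
BLACKLIST = ("*", "sh", "/", "home", "pc_owner", "flag", "txt")


def passes_filter(cmd: str) -> bool:
    """True if cmd contains none of the blacklisted substrings."""
    return not any(bad in cmd for bad in BLACKLIST)


def matches_flag_file(pattern: str) -> bool:
    """True if the shell pattern would expand to flag.txt."""
    return fnmatchcase("flag.txt", pattern)
```

A usable payload has to satisfy both conditions: it must pass the filter and its pattern must still expand to flag.txt, which holds for fla?.tx? as well as for fl?g.t?t.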

Nuit du Hack CTF 2018 CoinGame

The Nuit du Hack CTF 2018 CoinGame challenge:

The URL presented us basically only with a simple webform, which fetches a resource we can specify via cURL:

After a bit of trying, we figured out that file:/// URLs also work, like file:///etc/passwd:

Fetching a lot of files from the server did not get us very far. After a while we noticed the text on the main site: “DESIGNED BY TOTHEYELLOWMOON”

Searching for this and CoinGame, we found a GitHub repo: https://github.com/totheyellowmoon/CoinGame
The description of that repo read: “Congrats it was the first step ! Welcome on my Github, this is my new game but I haven’t pushed the modifications …”

From the description of the challenge and the GitHub repo we gather that “CoinGame” is being developed on this server and some changes aren’t pushed yet to the repo.
From /etc/passwd and /var/log/dpkg.log on the server we’ve also figured out that probably a tftp server is running on that system.

Requesting http://coingame.challs.malice.fr/curl.php?way=tftp://127.0.0.1/README.md we found the local repository:

Next we cloned the public GitHub repo, which gave us a list of all existing files in the repository. We looped over all the files and downloaded them via tftp from the system. Then we simply ran a diff between the checkout and the downloaded files. None of the code had any differences, but a few pictures didn’t match:

In each of the gameAnimationImages/background*.png images the flag was visible:

The flag was: flag{_Rends_L'Arg3nt_!}
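The comparison step can also be done in a few lines of Python instead of diff. A sketch, assuming the public checkout and the files fetched via tftp live in two separate directories (both paths are up to you):

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def differing_files(reference: Path, downloaded: Path) -> list[str]:
    """List files from the checkout that are missing or different in the download."""
    changed = []
    for ref_file in sorted(reference.rglob("*")):
        rel = ref_file.relative_to(reference)
        if not ref_file.is_file() or ".git" in rel.parts:
            continue  # skip directories and the repository metadata
        other = downloaded / rel
        if not other.is_file() or file_digest(ref_file) != file_digest(other):
            changed.append(rel.as_posix())
    return changed
```

Comparing hashes works for binary files like the PNGs just as well as for source code.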

iOS camera QR code URL parser bug

I’ve learned recently that the iOS 11 camera app will now automatically scan QR codes and interpret them.
This is pretty cool, until now you needed special apps to do that for you on iOS.
When scanning a QR code which contains a URL – in this case https://infosec.rm-it.de/ –  iOS will show a notification like this:

Naturally the first thing I want to try is to construct a QR code which will show an unsuspicious hostname in the notification but then open another URL in Safari.

And this is exactly what I found after a few minutes. Here it is in action:

There is no redirect misuse being done on facebook.com, Safari will only access infosec.rm-it.de.

Details:

If you scan this QR code with the iOS (11.2.1) camera app:

The URL embedded in the QR code is:
https://xxx\@facebook.com:443@infosec.rm-it.de/

It will show this notification:

But if you tap it to open the site, it will instead open https://infosec.rm-it.de/:

The URL parser of the camera app does not detect the hostname in this URL in the same way as Safari does.
It probably detects “xxx\” as the username to be sent to “facebook.com:443”,
while Safari might take the complete string “xxx\@facebook.com” as a username and “443” as the password to be sent to infosec.rm-it.de.
This leads to a different hostname being displayed in the notification compared to what is actually opened in Safari.
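The disagreement is easy to reproduce with two parsers that differ only in which “@” they treat as the end of the user information. This is just an illustration in Python: I don’t know how either Apple parser is implemented, and host_split_at_first_at is a deliberately flawed, made-up function:

```python
from urllib.parse import urlsplit

URL = "https://xxx\\@facebook.com:443@infosec.rm-it.de/"


def host_split_at_first_at(url: str) -> str:
    """Hypothetical flawed parser: everything before the FIRST '@' is userinfo."""
    authority = url.split("//", 1)[1].split("/", 1)[0]
    hostport = authority.split("@", 1)[1]  # 'facebook.com:443@infosec.rm-it.de'
    return hostport.split(":", 1)[0]


def host_split_at_last_at(url: str) -> str:
    """Python's urlsplit treats everything before the LAST '@' as userinfo."""
    return urlsplit(url).hostname or ""
```

The first function returns facebook.com and the second infosec.rm-it.de, which is the same mismatch as between the notification and the page that is opened.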

This issue has been reported to the Apple security team on 2017-12-23.
As of today (2018-03-24) this is still not fixed.

Update:
On 2018-04-24 this has been fixed with iOS 11.3.1 and macOS 10.13.4.
CVE-2018-4187 has been assigned to both issues.

 

NeverLAN CTF 2018 JSON parsing 2

The NeverLAN CTF challenge JSON parsing 1:

The linked file can be found here.

The JSON file contains a minute of VirusTotal scan logs. The challenge wants us to provide a SHA256 hash of a PE resource which was most commonly uploaded by multiple users. In the data there is the unique_sources field, which will show us which file was uploaded the most by unique users.

Basically I use a short Python script to format the JSON so that it is easier to read, find the highest number of unique_sources and then search the full file for that record.

from pprint import pprint
import json

with open('file-20171020T1500') as f:
    for line in f:
        data = json.loads(line)
        pprint(data)

Running this script like this:

python json2.py |fgrep 'unique_sources' | cut -d ' ' -f 3|sort -n | tail -1

This finds that there is one record with a unique_sources count of 128.
Searching for it like this in the full file:

fgrep 'unique_sources": 128' file-20171020T1500

We get the full scan record back; submitting any of the PE resources’ SHA256 hashes will work as the flag.
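The whole search can also be done in Python alone, without the shell pipeline. A sketch, assuming unique_sources is a top-level field of each record (which is what the pprint output suggests); the function name is my own:

```python
import json


def most_shared(lines):
    """Return the record with the highest unique_sources value, or None."""
    best = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        record = json.loads(line)
        if best is None or record.get("unique_sources", 0) > best.get("unique_sources", 0):
            best = record
    return best
```

It accepts any iterable of JSON lines, so an open file object such as open('file-20171020T1500') can be passed in directly.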