Detecting Client Library Using HTTP Headers

cross post at: https://www.perched.io/blog/2019/5/6/detecting-client-library-using-http-headers

This post describes a passive way, using Zeek (Bro) and the Elastic Stack within RockNSM, to detect the library used to make a web request from its HTTP headers.

When it comes to HTTP, the main focus has always been on the layer 7 application details in the User-Agent header. However, the User-Agent can be spoofed and is typically replaced with any value of choice.

Using the described method has positive and negative implications for both Blue Teams (defenders) and Red Teams (attackers).


What in the HELK (Release)

Recently Roberto and I have pushed a new update to HELK.
Some highlighted features of this update:

  • Elastic ELK Stack 6.6.1 and all the great new features in the trial and basic version
  • Drastically reduced the minimum requirements to install HELK
  • Allows you to run HELK in small testing environments such as your laptop -- make sure to still reference the installation section
    • My test run, for the ELK + Kafka + KSQL components of HELK, was a VM with 3 cores and 5GB RAM, ingesting 1,000,000+ events from 3 devices.
  • I was able to run my setup at 5GB, but I believe 4GB may be enough, and within the next few releases this should definitely be attainable
  • Kibana uses URI cache instead of URIs that grow to be thousands of characters long.
  • ES tunings for storage and speed.
    • g1gc heap collection
    • template settings for niofs and such
  • Some additional enrichments (I would probably refer to these as additions) -- explained in detail below
  • Finally, better support for running NXLog and Winlogbeat simultaneously with no changes to the pipeline
    • NXLog can be sent over TCP on port 8531 to HELK
    • In the future we hope to get it into Kafka, but currently that is a "paid" feature from NXLog.

Additional HELK Enrichments/Additions

We have added some basic yet effective additions to some of the field values in the HELK.
These include the length of values such as command lines and PowerShell scripts, hashes/fingerprints, and whether certain values contain non-ASCII characters.

New fields to highlight:
  • fingerprint_powershell_param_value_mm3 -- murmur3 hash of the powershell parameter values
    • more efficient stack counting (ie: term stacking the top 500)
    • more efficient inclusion/exclusion (ie: filter for OR filter out)
  • meta_powershell_scriptblock_text_length -- length of the entire powershell script block text
    • can help if you only need to search powershell scripts within a certain size (for example > 50 characters or < 3000 characters), or if you are looking for suspiciously long or short scripts
  • fingerprint_powershell_scriptblock_text_sha1 -- sha1 hash of the entire powershell script block text
    • more efficient stack counting (ie: term stacking the top 1000). It is far more efficient to stack 40 character strings than strings of (potentially up to) 32,000+ characters
    • more efficient inclusion/exclusion (ie: filter for OR filter out). It is far more efficient to filter out a 40 character string than 30 different values that are each hundreds or thousands of characters long
  • meta_process_command_line_length -- length of process command line
    • looking for suspiciously long or short values
  • fingerprint_process_command_line_mm3 -- hash of the command line value
    • more efficient stack counting (ie: term stacking the top 500)
    • inclusion/exclusion (ie: filter for OR filter out)
  • meta_process_command_line_has_non_ascii -- if there are any NON ASCII characters
    • useful for various evasion detection or suspicious parameters
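
To make the idea behind these fields concrete, here is a minimal Python sketch of the same kind of enrichment. This is not how HELK computes them (HELK does this inside its own pipeline); the function names are hypothetical, only the output field names come from the list above, and the murmur3 fields are left out because murmur3 is not in the Python standard library.

```python
import hashlib


def enrich_script_block(script_block_text: str) -> dict:
    """Hypothetical helper: length and SHA-1 fingerprint of a script block."""
    return {
        "meta_powershell_scriptblock_text_length": len(script_block_text),
        # SHA-1 always yields a 40 character hex string, however long the script is
        "fingerprint_powershell_scriptblock_text_sha1": hashlib.sha1(
            script_block_text.encode("utf-8")
        ).hexdigest(),
    }


def enrich_command_line(command_line: str) -> dict:
    """Hypothetical helper: length and non-ASCII flag for a command line."""
    return {
        "meta_process_command_line_length": len(command_line),
        # True if any character falls outside the 7-bit ASCII range
        "meta_process_command_line_has_non_ascii": not command_line.isascii(),
    }


if __name__ == "__main__":
    print(enrich_script_block("Get-Process"))
    print(enrich_command_line("cmd.exe /c whoami"))
```

The point of the fingerprint is that stacking or filtering on a fixed-length hash is much cheaper than doing the same on the full text.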
With these new additions, make sure to refresh your index patterns inside of Kibana:
Kibana > Management > Index Patterns > (Select an) Index Name > Click the "recycle" button in the top right (if you hover over it, it will say "Refresh field list")

Quick note: fields prefixed with "meta_" are additions. They are not necessarily likely to be incorrect, but they are data added on top of the original value, so they may be susceptible to change over time whenever parts of the values are cleaned up beforehand. Honestly, many of them should never change, but now that we are adding more of these "meta_" fields, it was just time to explain it.
The best example I could give of something "changing" is the meta_ geo ip data -- this is data that can change over time. For example, if you are looking at an event/log from a year ago, the geo ip data is probably not the same as today. "" is probably the best example of something like that: over the last few years it has gone from a suspicious holder, to something in Australia, to now Cloudflare.

A whole heck of a lot more of the above is to come in the following releases.

Let me know what you think, any issues, or if you yourself have any ideas you would like to see in HELK.

Cheers and happy hunting.


Finding Malicious Chrome Plugins Using ELK and Bro HTTP Logs

This blog will discuss using the HTTP header "Origin" combined with Bro NSM & Elastic ELK for a few different scenarios to detect malicious activity, general suspicious/anomalous activity, or as an added network "forensic" artifact.
You can see the RFC for the Origin header here:

The HTTP Origin header can be used to aid in detecting

  • Google Chrome (Browser) plugins/add-ons.
    The purpose of this blog -- which will be discussed at length after describing the other methods.
  • CSRF.
    There are multiple ways to use ELK and Bro to potentially find some of the issues described at that OWASP link, using your creativity and the Referer, Origin, and Host HTTP header fields.
  • Historical artifact of where a connection came from.
    This could be used along with the HTTP Referer header. This is described in the quote "In some sense, the origin granularity is a historical artifact of how the security model evolved." from 
    Also, there are scenarios where an origin could be a "file://" as described here:

The Problem/Threat

Now going back to Chrome (Browser) plugins. For brevity, and because Google Chrome is the most widely used browser today, we will only be focusing on plugins for it.

Browser plugins are a really interesting way to perform malicious actions. These actions include key logging the browser session, capturing web site data, recording each site visited, taking screenshots of pages visited, and mining crypto-currencies. Plugins are so powerful that they can even fully remote control the device.

These data, screenshot, and key logging capabilities are especially useful nowadays, given that the majority of an employee's tasks are performed in the browser, along with company email, banking, social, etc.

Also, let's say for example you have a user who connects to your VPN and has a malicious browser plugin that is taking screenshots of every site the user visits. Although the plugin's communication to its C2 may be blocked while on the VPN, once the user disconnects from the VPN everything will be uploaded.
Therefore, think about what is exposed if they are performing admin tasks for firewalls/switches/etc. in their browser, or accessing a web database or some other internal site hosting intellectual property.

Finally, they are a lot less likely to be picked up by network security monitoring techniques. Traditional detection measures that may yield no results in detecting plugins:
- Sandboxing == I am not aware of any sandboxes that even run Chrome plugins (if you know of any then please comment and I will update the blog accordingly)
- IDS signatures / AntiVirus == Not only do static/signature based detection mechanisms fall short as your only means of coverage, but Chrome plugins are a mix of JavaScript/HTML/CSS, and these languages can be obfuscated to the moon and back. Example:
text > hex > utf16 > base64
TEXT=i am a malicious plugin
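
To show why a static signature on the plain text stops matching, here is a minimal Python sketch of that text > hex > utf16 > base64 chain. The exact encodings are my assumptions (UTF-8 for the text, little-endian UTF-16 for the hex string); a real plugin could chain any number of other steps.

```python
import base64


def obfuscate(text: str) -> str:
    """text > hex > utf16 > base64, under the encoding assumptions above."""
    hex_text = text.encode("utf-8").hex()        # text > hex
    utf16_bytes = hex_text.encode("utf-16-le")   # hex > utf16 (assumed little-endian)
    return base64.b64encode(utf16_bytes).decode("ascii")  # utf16 > base64


def deobfuscate(blob: str) -> str:
    """Reverse the chain: base64 > utf16 > hex > text."""
    hex_text = base64.b64decode(blob).decode("utf-16-le")
    return bytes.fromhex(hex_text).decode("utf-8")


if __name__ == "__main__":
    text = "i am a malicious plugin"
    blob = obfuscate(text)
    print(blob)
    print(deobfuscate(blob))
```

A signature looking for the original string would not match the encoded blob, even though the plugin can trivially recover it at run time.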

How HTTP Origin Looks on the Wire/Network

Although plugins may be difficult to detect from a signature perspective, they leave a very specific fingerprint on the network when they make a connection, specifically in the "Origin" HTTP header. When a Google Chrome plugin makes a request, the actual Chrome extension UID is in the Origin header/field!
Origin: chrome-extension://pgeolalilifpodheeocdmbhehgnkkbak

You can then perform a google search on the UID (which is "pgeolalilifpodheeocdmbhehgnkkbak") to see what the actual plugin is from the Chrome Web Store. You can also use:

In my example I had to use the search method because this plugin was something I found back from 2015 and the UID has changed.
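
Pulling the UID out of the Origin value is easy to script. Below is a minimal Python sketch; the function name is hypothetical, and the pattern assumes the usual Chrome extension ID format of 32 lowercase letters in the range a to p.

```python
import re
from typing import Optional

# Assumption: Chrome extension IDs are 32 characters drawn from the letters a-p
EXTENSION_ORIGIN = re.compile(r"^chrome-extension://([a-p]{32})/?$")


def extension_id_from_origin(origin: str) -> Optional[str]:
    """Return the extension UID if the Origin value is a Chrome extension, else None."""
    match = EXTENSION_ORIGIN.match(origin.strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extension_id_from_origin("chrome-extension://pgeolalilifpodheeocdmbhehgnkkbak"))
    print(extension_id_from_origin("https://www.example.com"))
```

The returned UID is the string you would then search for to find out what the plugin actually is.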

The plugin from this PCAP is taking each URL that a user visits, double base64 encoding it, and then POSTing it.
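
For anyone wanting to reverse that while looking at a POST body, here is a minimal Python sketch of double base64 encoding and decoding. The function names and the sample URL are hypothetical, and I am assuming the standard base64 alphabet and UTF-8 text.

```python
import base64


def double_b64_encode(url: str) -> str:
    """Encode the way the plugin is described as doing: base64 applied twice."""
    once = base64.b64encode(url.encode("utf-8"))
    return base64.b64encode(once).decode("ascii")


def double_b64_decode(body: str) -> str:
    """Recover the original URL from a double base64 encoded POST body."""
    once = base64.b64decode(body)
    return base64.b64decode(once).decode("utf-8")


if __name__ == "__main__":
    encoded = double_b64_encode("https://intranet.example.com/admin")
    print(encoded)
    print(double_b64_decode(encoded))
```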

The Solution/Hunt

Bro Setup (Adding HTTP Origin)

If you leverage Bro logs then you can easily search for malicious/suspicious Chrome plugins.
The first step is to add the HTTP Origin header to the Bro http.log. This can be accomplished by creating a simple Bro script to add the field as seen here:

In order to accomplish this as a whole (ie: restart Bro and everything needed to add the field to the log) you would perform the following:
# $CUSTOMDIR (where the script will live) and $BRODIR (the Bro install prefix) must already be set for your environment
mkdir -p "$CUSTOMDIR"

# Download the Bro script
wget https://gist.githubusercontent.com/neu5ron/cbfca0dfc42b1d6c96cd321d687e5495/raw/af6f2a36bef23ad77c34f1a4928fb5f36a30fc46/additional_http_headers-main.bro
mv additional_http_headers-main.bro main.bro
# Create the additional files needed to make the script a local Bro "package"
echo "@load ./main.bro" > __load__.bro
# Move all files to the $CUSTOMDIR
mv __load__.bro main.bro "$CUSTOMDIR"

# Edit local.bro to load our new script
echo "@load $CUSTOMDIR" >> "$BRODIR/share/bro/site/local.bro"
# Restart Bro so the new field is logged (for example with "broctl deploy", if you manage Bro with broctl)

Now, assuming you already have Bro HTTP logs going to ELK, we can perform some queries and stack counting analysis to find outliers/anomalies among Chrome extensions.

Exploring the Data in ELK

Performing a query to find any request made from a plugin is as simple as

Visualizing the Data in ELK

Now performing stack counting to find outliers/anomalies is just as simple as creating a data table visualization on origin.keyword.
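
If you prefer the API over the Kibana UI, the same search plus stack count can be expressed as one Elasticsearch request body. This is only a sketch of one possible shape, built as a plain Python dict: it assumes the Origin header is indexed as origin.keyword (as in the visualization above), and the function name and the "extensions" aggregation name are hypothetical.

```python
import json


def extension_stack_query(top_n: int = 50) -> dict:
    """Build a request body that counts requests per chrome-extension Origin."""
    return {
        "size": 0,  # we only want the aggregation buckets, not the raw hits
        "query": {
            # match any Origin that starts with the extension scheme
            "prefix": {"origin.keyword": "chrome-extension://"}
        },
        "aggs": {
            "extensions": {
                # one bucket per distinct Origin value, most frequent first
                "terms": {"field": "origin.keyword", "size": top_n}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(extension_stack_query(), indent=2))
```

The printed JSON can be sent to the _search endpoint of whichever index holds your Bro HTTP logs.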

We can then add the query/search from above (after saving it) and the visualization (after saving it) to a dashboard/single-view.

My example is showing a 2 day query and only returned 2 chrome extensions.
However, your environment may have many more. If your environment has many more plugins do not fear because stack counting is here.

Baselining (Stack Counting) or Whitelisting/Filtering

You can quickly whitelist a plugin's UID (using the techniques described above to look up what the plugin actually is via Google or the Chrome Web Store) so you never have to see the result again. You may also filter even more strictly by adding "method:(POST OR PUT)" to your query.
However, this example is of a very large network and searches across 432,365,000+ HTTP logs. Even if there were 25+ extensions making outbound connections, it would take less than 5 minutes to whitelist the legitimate extensions, and then you do not have to worry about making your hunt stricter than it needs to be, which could result in missing positives/malicious activity.
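
As a rough sketch of what that whitelisting looks like outside of Kibana, the Python below builds an Elasticsearch request body that excludes known-good Origins and can optionally restrict to POST/PUT. The function name is hypothetical, the whitelisted UID is a made-up placeholder, and method.keyword is my assumption for how the "method" field is indexed.

```python
from typing import Iterable


def hunt_query(whitelisted_origins: Iterable[str], writes_only: bool = False) -> dict:
    """Build a bool query for extension traffic minus whitelisted Origins."""
    filters = [{"prefix": {"origin.keyword": "chrome-extension://"}}]
    if writes_only:
        # equivalent in spirit to adding method:(POST OR PUT) to the search
        filters.append({"terms": {"method.keyword": ["POST", "PUT"]}})

    bool_query = {"filter": filters}
    whitelist = sorted(set(whitelisted_origins))
    if whitelist:
        # anything already reviewed and judged legitimate is dropped
        bool_query["must_not"] = [{"terms": {"origin.keyword": whitelist}}]
    return {"query": {"bool": bool_query}}


if __name__ == "__main__":
    # Placeholder UID for illustration only, not a real extension
    reviewed = ["chrome-extension://" + "a" * 32]
    print(hunt_query(reviewed, writes_only=True))
```

Keeping the whitelist as data rather than hand-editing the query makes it easy to add a UID each time you finish reviewing an extension.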

Alerting Instead of Hunting/Exploring

For all of those who don't want to look at a dashboard or run a search/query:
After excluding any false positives, we can turn our search into an X-Pack Watcher alert that emails/notifies us any time a new extension shows up.
This means you don't have to worry about malicious Chrome plugins going unnoticed and can focus on other things :)


Typosquatting Detection with ELK & Bro NSM

DNS... I hope as network defenders we all know the value of it. Some may not, as my technical/CND lead once told me "so what, it's just DNS". If that's the case, here is a short list:
DNS gives you the ability to cheaply block common malware/C2s, content related to at-work lawsuits, and SSL (via the hostname lookup). Just as it is commonly used to "defend", it is a great tool to proactively block malicious ad redirects (by blocking advertisement domains) or dynamic DNS. Yes... there was a time where you could simply block many "APT" groups by blocking dynamic DNS lookups.
However, beyond a purely "defensive" standpoint, DNS logs are important for detecting DNS exfil, DGA bots (a simple way is via NXDOMAIN, though sometimes something more complex is required), and DNS as a C2 channel, as well as for (retro-)hunting.

I want to show a use case of how Elastic ELK can be used to "hunt" for typo-squatting domains. Also, DNS was what got me into information security, and I have been wanting to blog about the things I have been doing with Bro (DNS) and ELK for years, but I never take the time...
So I will make this quick :)

Everyone loves ambulance chasers in infosec ;) so what better way than to write a quick blog regarding typo-squatting detection shortly after Brian Krebs' recent article:

Elastic has a very powerful text/string analysis engine, and with it you can perform what they refer to as "fuzzy" queries based on Levenshtein distance. Therefore, let's look for some domains that have some form of character addition/substitution.
Let's look at some popular domains that are sometimes spoofed and then used in malware C2 comms. For this example that means typo-squatting on google, microsoft, and only a few other domains (for the sake of brevity).
** Get creative and look for typo-squatting on your own company :) Bad folks love to use your company's/entity's domain name for their C2. If you have to tune your search it will only take a few moments, and after that you could even turn your search into an alert that gets sent via email (or another form of comms like slack/text) by using Elastic X-Pack Watcher OR something like elastalert. **

I have performed normalization on domain names that allows me to perform exact (match) queries on each level of a domain. For example, in "www.google.com": www = 3rd level, google = 2nd level, com = 1st level. The query for your environment may instead be "query:google~1".
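
To make the "~1" idea concrete outside of Elastic, here is a minimal Python sketch that splits out the 2nd level of a domain and flags names within edit distance 1 of a brand. The function names and sample domains are hypothetical. It uses plain Levenshtein distance (so a swapped pair of letters counts as 2 edits, whereas Elastic's fuzzy matching may count a transposition as 1), and it naively takes the second-to-last label as the 2nd level, which is wrong for suffixes like co.uk.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def second_level(domain: str) -> str:
    """Naive 2nd level label: 'www.google.com' -> 'google'."""
    labels = domain.lower().rstrip(".").split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]


def typosquat_candidates(domains: Iterable[str], brand: str,
                         max_distance: int = 1) -> List[str]:
    """Domains whose 2nd level is close to, but not exactly, the brand."""
    hits = []
    for domain in domains:
        label = second_level(domain)
        if label != brand and levenshtein(label, brand) <= max_distance:
            hits.append(domain)
    return hits


if __name__ == "__main__":
    seen = ["www.google.com", "www.gooogle.com", "mail.gogle.net", "example.org"]
    print(typosquat_candidates(seen, "google"))
```

Excluding exact matches is what keeps the legitimate domain itself out of the results.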

Last visualization: if you normalize DNS, HTTP, and SSL into a common schema, then you can even perform one query to see all connections involving a typo-squatting domain. You can then quickly tell not only whether there was a DNS lookup but also whether an actual HTTP(S) connection was made. You can also perform aggregations, which are the other true power of Elastic.
searching + aggregations = victory