Detecting Client Library Using HTTP Headers

This is a blog post in which I discussed using HTTP header ordering to detect spoofed user agents, based on the actual application or library making the request and how that library's HTTP API gets called. https://drive.google.com/file/d/1iX-ZMhtkBJrl_PR1-b33044diA_MTXwG/view?usp=sharing It is research I performed many years before working for Perched, but it ended up being posted at Perched and was ultimately taken down after Elastic acquired it. Somebody saved a PDF of the site before it was gone, so here it is. No shout out to Elastic nor that previous company. I just want to link to my original work, despite that company always saying the work I posted would continue to exist.
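As a rough illustration of the idea, here is a minimal Python sketch, not the code from the original research. It extracts the header names from a raw request in wire order and checks whether they follow the order expected for the client named in the User-Agent. The profile names and header orders are hypothetical placeholders made up for this example; real profiles would have to be collected by observing actual browsers and libraries.

```python
from typing import Dict, List

# ASSUMPTION: placeholder profiles invented for illustration only. They are
# not measured from any real browser or HTTP library.
HYPOTHETICAL_PROFILES: Dict[str, List[str]] = {
    "example-browser": ["host", "user-agent", "accept", "accept-encoding", "connection"],
    "example-library": ["host", "accept-encoding", "accept", "connection", "user-agent"],
}


def header_order(raw_request: bytes) -> List[str]:
    """Return lower-cased header names in the order they appear on the wire."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")[1:]  # skip the request line
    return [line.split(":", 1)[0].strip().lower() for line in lines if ":" in line]


def follows_profile(observed: List[str], profile: List[str]) -> bool:
    """True if the headers shared with the profile appear in the profile's order."""
    shared = [name for name in observed if name in profile]
    expected = [name for name in profile if name in shared]
    return shared == expected


def looks_spoofed(raw_request: bytes, claimed_client: str) -> bool:
    """True if the header order does not match the profile of the claimed client."""
    profile = HYPOTHETICAL_PROFILES[claimed_client]
    return not follows_profile(header_order(raw_request), profile)
```

A request that claims to be "example-browser" but sends its headers in the "example-library" order would be flagged, because the User-Agent string is easy to change while the order in which a library emits headers usually is not.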


What in the HELK (Release)

Roberto and I recently pushed a new update to HELK.
Some highlighted features of this update:

  • Elastic ELK Stack 6.6.1 and all the great new features in the trial and basic versions
  • Drastically reduced the minimum requirements to install HELK
  • Makes it possible to run HELK in small testing environments, such as on your laptop -- make sure to still reference the installation section
    • My test run, for the ELK + Kafka + KSQL components of HELK, was a VM with 3 cores, 5GB RAM, ingesting 1,000,000+ events from 3 devices.
    • I was able to run my setup at 5GB, but I believe 4GB may be enough, and within the next few releases this should definitely be attainable
  • Kibana uses URI cache instead of URIs that grow to be thousands of characters long.
  • ES tunings for storage and speed.
    • G1GC garbage collection for the heap
    • template settings for niofs and such
  • Some additional enrichments (I would probably refer to these as additions) -- explained in detail below
  • Finally, better support for running NXLog & Winlogbeat simultaneously with no changes to the pipeline
    • NXLog events can be sent over TCP on port 8531 to HELK
    • In the future we hope to get it into Kafka, but currently that is a "paid" feature from NXLog.
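For the "template settings for niofs" item in the list above, here is a minimal sketch of what such an index template body could look like, written in Python so it can be printed as JSON. It is not HELK's actual template: the index pattern is a hypothetical placeholder and only one setting is shown. `index.store.type` with the value `niofs` is a standard Elasticsearch index setting, and G1GC itself is enabled separately on the JVM (for example with `-XX:+UseG1GC` in `jvm.options`).

```python
import json
from typing import Dict, List


def build_store_template(index_patterns: List[str]) -> Dict[str, object]:
    """Build an index template body that selects the niofs store type."""
    return {
        # ASSUMPTION: the caller supplies the patterns; HELK's real index
        # names are not taken from the post.
        "index_patterns": index_patterns,
        "settings": {
            # Standard Elasticsearch setting that selects the NIO-based
            # file system store for matching indices.
            "index.store.type": "niofs",
        },
    }


if __name__ == "__main__":
    # "logs-example-*" is a hypothetical pattern used only for illustration.
    print(json.dumps(build_store_template(["logs-example-*"]), indent=2))
```

In Elasticsearch 6.x a body like this would be applied through the index template API, so new indices matching the pattern pick up the setting when they are created.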

Additional HELK Enrichments/Additions

We have added some basic yet effective additions to some of the field values in HELK.
These include the length of values such as command lines & PowerShell script blocks, hashes/fingerprints, and whether certain values contain non-ASCII characters.

New fields to highlight:
  • fingerprint_powershell_param_value_mm3 -- murmur3 hash of the PowerShell parameter values
    • more efficient stack counting (e.g. term stacking the top 500)
    • more efficient inclusion/exclusion (e.g. filter for OR filter out)
  • meta_powershell_scriptblock_text_length -- length of the entire PowerShell script block text
    • helps when you only need to search certain parameters in PowerShell scripts > 50 characters or < 3,000 characters, or when looking for suspiciously long or short scripts
  • fingerprint_powershell_scriptblock_text_sha1 -- SHA1 hash of the entire PowerShell script block text
    • more efficient stack counting (e.g. term stacking the top 1000). It is far more efficient to stack 40 character strings than strings of (potentially up to) 32,000+ characters
    • more efficient inclusion/exclusion (e.g. filter for OR filter out). It is far more efficient to filter out a 40 character string than 30 different values that are each hundreds or thousands of characters long
  • meta_process_command_line_length -- length of process command line
    • looking for suspiciously long or short values
  • fingerprint_process_command_line_mm3 -- murmur3 hash of the command line value
    • more efficient stack counting (e.g. term stacking the top 500)
    • more efficient inclusion/exclusion (e.g. filter for OR filter out)
  • meta_process_command_line_has_non_ascii -- whether there are any non-ASCII characters
    • useful for various evasion detection or suspicious parameters
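To make the additions above more concrete, here is a minimal Python sketch of how values like these could be derived from a command line and then stack counted. It is not HELK's pipeline code. It assumes length is counted in characters, and it uses SHA-1 from the standard library as a stand-in fingerprint because murmur3 is not in the Python standard library; the `example_command_line_sha1` field name is hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def fingerprint(value: str) -> str:
    """Return a 40 character SHA-1 hex digest of the value.

    ASSUMPTION: SHA-1 stands in for the murmur3 hash named in the post.
    """
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def enrich_command_line(command_line: str) -> Dict[str, object]:
    """Derive length, non-ASCII flag, and fingerprint for one command line."""
    return {
        # ASSUMPTION: length is measured in characters, not bytes.
        "meta_process_command_line_length": len(command_line),
        "meta_process_command_line_has_non_ascii": any(
            ord(ch) > 127 for ch in command_line
        ),
        # Hypothetical field name for the SHA-1 stand-in fingerprint.
        "example_command_line_sha1": fingerprint(command_line),
    }


def stack_count(command_lines: Iterable[str], top_n: int) -> List[Tuple[str, int]]:
    """Count fixed-length fingerprints instead of the full command lines."""
    counts = Counter(fingerprint(line) for line in command_lines)
    return counts.most_common(top_n)
```

Counting or filtering on the short, fixed-length fingerprint is what makes the stacking and inclusion/exclusion cheaper than comparing the full, arbitrarily long original values.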
With these new additions, make sure to refresh your index patterns inside of Kibana:
Kibana > Management > Index Patterns > (Select an) Index Name > Click the "recycle" button in the top right (if you hover over it, it will say "Refresh field list")

Quick note: fields prepended with "meta_" are additions that are not necessarily likely to be incorrect; however, they are data added on top of the original value. We therefore prepend "meta_" to signal that they may be susceptible to change over time, for example whenever parts of the values are cleaned up beforehand. Honestly, many of them should never change, but now that we are adding more of these "meta_" fields, it was just time to explain it.
The best example I could give of something "changing" is the meta_ geo IP data -- this is data that can change over time, e.g. if you are looking at an event/log from a year ago, the geo IP data is probably not the same as today. "" is probably the best example of something like that: over the last few years it has gone from a suspicious holder, to something in Australia, to now Cloudflare.

A whole heck of a lot more of the above is to come in the following releases.

Let me know what you think, any issues, or if you yourself have any ideas you would like to see in HELK.

Cheers and happy hunting.