2021-03-03

Elastic Wildcard ECS Whirlwind

Common Elastic Issues Regarding Cost & Search that Still Exist in 2023
An unposted blog from 2020 that is still true in 2023 and will be beyond.

Elastic Common Schema is rolling back the wildcard data type (security use case searching savior) from ECS 1.8...

reference 1 https://github.com/elastic/ecs/issues/1233

reference 2 https://github.com/elastic/ecs/pull/1237



For prior reading on the wildcard data type, keyword data type, text/analyzed fields, and case-insensitive searching on cyber-security-related data/logs, all with/around/using Elastic Common Schema and logging use cases:


I noticed in a GitHub comment on reference 2 that Elastic discovered "some notable performance issues related to storage size and indexing throughput that we must have time to review and address in a comprehensive way".


Right..... indexing things increases storage versus storing the thing as-is. It's roughly 1 x $IndexTerms.. UNLESS you get good compression ratios. Usually good compression comes at the cost of CPU somewhere, whether client, server, index, or elsewhere (more on that later ;)
However, compression, and ultimately a reduction in storage, was a huge thing that Elastic touted in their big announcement of the wildcard data type..


As I started digging further down the rabbit hole of why Elastic decided to roll back wildcard in 1.8, given that case-sensitive log/search bypasses have been well documented and communicated for almost 2 years now....


I noticed a very peculiar comment on the PR to Lucene that added the sauce (code) for making the compression for the wildcard data type better... The comment reads: "There's a trade-off here between efficient compression (more docs-per-block = better compression) and fast retrieval times (fewer docs-per-block = faster read access for single values)"


OK... It should be pretty clear, but look also at the wording of many of the PRs... You will see things like "most cases" or "if" the data is similar or different.
In short, IT DEPENDS...
There are trade-offs in databases as a whole, let alone in sub-components of them. Whether it is Elasticsearch, some SQL DB, you name it..

I just want to know how we got here. How did we mess up the ability to search for "does a value contain XYZ", regardless of upper/lower/space/etc.?
Elasticsearch could always do that before. Side bar... Yes, the analyzed field was not perfect for security use cases, but it was there, and it was easier to work around its shortcomings than the situation the cyber security community is in now (mostly that nobody knows their searches are not returning the results they expect)..
The company could have just created a community analyzer like that neu5ron person.. I think he even worked there at one point ;)

Even if wildcard data type had fixed everything by now, you still lose other powerful aspects of searching in Lucene (elasticsearch backend).

Such as fuzzy/Levenshtein-distance and term/ordering queries... so on and so forth... The things that are still useful for security use cases. The things Elasticsearch as a whole is useful for in most/all use cases, let alone cyber.

Can somebody at Elastic tell me what was wrong with keyword (data type).. setting the doc values to 10,000+.. global ordinals.. and creating a custom text analyzer? What does the wildcard data type get us that is so special it needs its own brand-new data type.. and needed to be licensed before the great big license change even happened?


We would have solved the vast majority of the issues by now (free text search :), kept the other search functionality.. fewer template/mapping changes... everybody roasting marshmallows and searching for bad folks on their networks.
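To sketch what I mean (the index settings and field name here are just my illustration of the idea, not something Elastic published):

```python
import json

# Hypothetical index mapping: a keyword field with a lowercase
# normalizer gives exact-match, aggregatable values that are also
# case-insensitive to search -- no wildcard data type required.
mapping = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "filter": ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "process.command_line": {
                "type": "keyword",
                "ignore_above": 10000,
                "normalizer": "lowercase_normalizer",
                "eager_global_ordinals": True
            }
        }
    }
}

# This body would be PUT to the index at creation time.
print(json.dumps(mapping, indent=2))
```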

The only explanations I can think of for why the wildcard data type fiasco occurred:
It was to decrease "storage" for licensing/purchase cost...
or perhaps some Amazon debacle - because the wildcard data type had become licensed (even before the big license change situation).
It also does not help.. if there is nobody within, or empowered within, Elastic's organization who is what I like to call a "glue person". This would be somebody who transcends multiple aspects of the business and use case.. In this example: somebody who knows the security use cases, knows the backend/Lucene (even a small amount is all that would have been needed), has actively or recently deployed in a production environment, has maintained a deployment, AND, most importantly, uses the data like an analyst would.. and works with the cyber community.

But let's think for a second.. Storage is one of the cheapest computing resources there is (vs CPU/RAM)..

So then what..?!
This is where it all gets muddy... Perhaps the increase in storage was such a big deal because there is a bigger pricing issue.. A catch-22 where they shoot themselves in the foot and come in at a higher cost than anybody would expect, because of having to license more nodes (based on that additional storage)..
NOT TO MENTION... shooting themselves in the foot by moving a lot of the parsing/ECS work to Elasticsearch "ingest" nodes, which are licensed nodes... Compression overhead = more compute.. more compute = more licensed nodes.. more licensed nodes = more license cost...
Or this is a genius evil business model :)

However, I don't think that the storage increase is the real cost factor if it is done realistically. I think this is a cloud storage licensing model issue.. Combined with what I think is the biggest thing: some religious (sales) document out there that says X amount of TBs per X amount of (licensed) nodes, "NO MORE NO LESS"... and those numbers are pretty unrealistic, I would assume.
Because, after X amount of days of immediately available (HOT architecture) data, where writes and reads overlap at the same time, it is not a huge concern to have much larger disks on a single server/resource-unit.......


As it still stands, I am completely uncertain what the need for the wildcard data type was.

2019-05-08

Detecting Client Library Using HTTP Headers

A blog where I discussed using HTTP header ordering as a means to detect spoofed user agents, based on the actual application or library and how the HTTP library's API gets called. https://drive.google.com/file/d/1iX-ZMhtkBJrl_PR1-b33044diA_MTXwG/view?usp=sharing It is research I performed many years before working for Perched, but it ended up being posted at Perched and was ultimately taken down after Elastic acquired the company. Somebody saved a PDF of the site before it was gone, so here it is. No shout-out to Elastic nor that previous company; I just want to link to my original work, despite that company always saying the work I posted would remain up.

2019-02-24

What in the HELK (Release)

Recently Roberto and I have pushed a new update to HELK.
Some highlighted features of this update:

  • Elastic ELK Stack 6.6.1 and all the great new features in the trial and basic version
  • Drastically reduced the minimum requirements to install HELK
  • Allows the ability to run HELK in small testing environments, such as on your laptop -- make sure to still reference the installation section 
    • My test run, for the ELK + Kafka + KSQL components of HELK, was a VM with 3 cores, 5GB RAM, ingesting 1,000,000+ events from 3 devices.
  • I was able to run my setup at 5GB, but I believe 4GB may be enough; in the next few releases this should definitely be attainable
  • Kibana uses a URI cache instead of URIs that grow to thousands of characters long.
  • ES tunings for storage and speed.
    • g1gc heap collection
    • template settings for niofs and such
  • Some additional enrichments (I would probably refer to these as additions) -- explained in detail below
  • Finally, better support for NXLog & Winlogbeat simultaneously, with no changes to the pipeline
    • NXLog can be sent over TCP on port 8531 to HELK
    • In the future, hoping to get it into Kafka, but currently that is a "paid" feature from NXLog.
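For reference, the ES tunings mentioned above amount to a couple of small config changes, roughly like this (a sketch of the idea; check the HELK repo for the exact values shipped):

```
# jvm.options -- use G1GC instead of the default CMS collector
-XX:+UseG1GC

# index template settings snippet -- niofs store type
"settings": {
  "index.store.type": "niofs"
}
```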



Additional HELK Enrichments/Additions

We have added some basic yet effective additions to some of the fields' values in HELK.
This includes the length of values like CLI & PowerShell, hashes/fingerprints, and whether certain values contain non-ASCII characters.


New fields to highlight:
  • fingerprint_powershell_param_value_mm3 -- murmur3 hash of the powershell parameter values
    • more efficient stack counting (ie: term stacking the top 500)
    • more efficient inclusion/exclusion (ie: filter for OR filter out)
  • meta_powershell_scriptblock_text_length -- length of the entire powershell script block text
    • can help if you need to only search certain parameters in a PowerShell script > 50 characters or < 3000 characters.. or when looking for suspiciously long or short scripts
  • fingerprint_powershell_scriptblock_text_sha1 -- sha1 hash of the entire powershell script block text
    • more efficient stack counting (ie: term stacking the top 1000), it is far more efficient to stack 40 character strings versus (potentially up to) 32,000+ characters
    • more efficient inclusion/exclusion (ie: filter for OR filter out), it is far more efficient to filter out a 40 character string versus 30 different values consisting of hundreds or thousands of characters long
  • meta_process_command_line_length -- length of process command line
    • looking for suspiciously long or short values
  • fingerprint_process_command_line_mm3 -- hash of the command line value
    • more efficient stack counting (ie: term stacking the top 500)
    • inclusion/exclusion (ie: filter for OR filter out)
  • meta_process_command_line_has_non_ascii -- if there are any NON ASCII characters
    • useful for various evasion detection or suspicious parameters
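If you are curious, the logic behind these fields boils down to something like this Python sketch (in HELK the work actually happens in the pipeline, and the mm3 fields use a murmur3 library; sha1 from the standard library stands in for it here, and the event field names are simplified):

```python
import hashlib

def enrich(event: dict) -> dict:
    """Add length, fingerprint, and non-ASCII metadata for select fields."""
    cmd = event.get("process_command_line")
    if cmd is not None:
        event["meta_process_command_line_length"] = len(cmd)
        # A short, fixed-length fingerprint is far cheaper to stack-count
        # or filter on than the raw (potentially huge) value.
        # (HELK uses murmur3 for this field; sha1 shown since it is stdlib.)
        event["fingerprint_process_command_line_sha1"] = hashlib.sha1(
            cmd.encode("utf-8")).hexdigest()
        event["meta_process_command_line_has_non_ascii"] = any(
            ord(c) > 127 for c in cmd)
    script = event.get("powershell_scriptblock_text")
    if script is not None:
        event["meta_powershell_scriptblock_text_length"] = len(script)
        event["fingerprint_powershell_scriptblock_text_sha1"] = hashlib.sha1(
            script.encode("utf-8")).hexdigest()
    return event

demo = enrich({"process_command_line": "whoami /all"})
print(demo["meta_process_command_line_length"],
      demo["fingerprint_process_command_line_sha1"][:8])
```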
With these new additions, make sure to refresh your index patterns inside of Kibana:
Kibana > Management > Index Patterns > (Select an) Index Name > Click the "recycle" button in the top right (if you hover over it, it will say "Refresh field list")

Quick note: fields prepended with "meta_" are additions that are not necessarily likely to be incorrect; however, they are data added on top of the original value -- therefore they may be susceptible to change over time whenever parts of the values are cleaned up beforehand. Honestly, many of them should never change, but now that we are adding more of these "meta_" fields, it was just time to explain it.
The best example I can give of something "changing" is the meta_ geo IP data -- this is data that can change over time. ie: if you are looking at an event/log from a year ago, the geo IP data is probably not the same as today. "1.1.1.1" is probably the best example: over the last few years it has gone from a suspicious holder, to something in Australia, to now Cloudflare.



A whole heck of a lot more of the above to come in the following releases.


Let me know what you think, any issues, or if you yourself have any ideas you would like to see in HELK.


Cheers and happy hunting.

2018-04-17

Finding Malicious Chrome Plugins Using ELK and Zeek (Bro) HTTP Logs

This blog will discuss using the HTTP header "Origin" combined with Zeek (Bro) NSM & Elastic ELK for a few different scenarios to detect malicious activity, general suspicious/anomalous activity, or as an added network "forensic" artifact.
You can see the RFC for the Origin header here:
https://tools.ietf.org/html/rfc6454



The HTTP Origin header can be used to aid in detecting


  • Google Chrome (Browser) plugins/add-ons.
    The purpose of this blog -- which will be discussed in length after describing the other methods.
  • CSRF.
    https://www.owasp.org/index.php/Cross-Site_Request_Forgery_(CSRF)_Prevention_Cheat_Sheet
    There are multiple ways to use ELK and Bro to potentially find some of the issues described at that OWASP link, using your creativity and the Referer, Origin, and Host HTTP header fields.
  • Historical artifact of where a connection came from.
    This could be used along with the HTTP Referer header, as described in the quote "In some sense, the origin granularity is a historical artifact of how the security model evolved." from 
    https://tools.ietf.org/html/rfc6454#section-7
    Also, scenarios where an origin could be a "file://" as described here:
    https://tools.ietf.org/html/rfc6454#section-4



The Problem/Threat


Now going back to Chrome (Browser) plugins. For brevity, and the fact that Google Chrome is the most-used browser today, we will only be focusing on plugins for it.

Browser plugins are a really interesting way to perform malicious actions -- these actions include key logging of the browser session, stealing web site data, recording each site visited, screenshotting pages visited, and mining crypto-currencies; plugins are so powerful that they can even fully remote-control the device.


These actions regarding data, screenshots, and key loggers are especially useful nowadays, given that the majority of employees' tasks are performed in the browser, along with company email, banking, social, etc...

Also, let's say for example you have a user who connects to your VPN network and has a malicious browser plugin that is taking screenshots of every site the user visits. Although the communication of the plugin to its C2 may be blocked while on the VPN -- once the user disconnects from the VPN, everything will be uploaded.
So think about whether they are performing admin tasks for firewalls/switches/etc. in their browser, or accessing a web database or some other internal site hosting intellectual property. 

Finally, plugins are a lot less likely to be picked up by more traditional network monitoring techniques that do not involve data collection. Traditional detection measures that may yield no results in detecting plugins:
-Sandboxing == I am not sure of any sandboxes that even run Chrome plugins (if you know of any then please comment and I will update the blog accordingly)
-IDS signatures / AntiVirus == Not only are static/signature-based detection mechanisms a shortcoming as your only means of coverage, but Chrome plugins are a mix of JavaScript/HTML/CSS, and these languages can be obfuscated to the moon and back. Example:
text > hex > utf16 > base64
TEXT=i am a malicious plugin
HEX=6920616D2061206D616C6963696F757320706C7567696E
UTF-16=%u2069%u6D61%u6120%u6D20%u6C61%u6369%u6F69%u7375%u7020%u756C%u6967%u006E
BASE64=JXUyMDY5JXU2RDYxJXU2MTIwJXU2RDIwJXU2QzYxJXU2MzY5JXU2RjY5JXU3Mzc1JXU3MDIwJXU3NTZDJXU2OTY3JXUwMDZF
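If you want to reproduce that layering yourself, here is a quick Python sketch of it (the %u step pairs the bytes as little-endian 16-bit units, following the layout in the example above):

```python
import base64

text = "i am a malicious plugin"

# Step 1: plain hex
hex_form = text.encode("ascii").hex().upper()

# Step 2: %uXXXX escapes -- byte pairs rendered as little-endian
# 16-bit units, zero-padded if the length is odd
data = text.encode("ascii")
if len(data) % 2:
    data += b"\x00"
u16_form = "".join(
    "%%u%02X%02X" % (data[i + 1], data[i]) for i in range(0, len(data), 2))

# Step 3: base64 of the %u string
b64_form = base64.b64encode(u16_form.encode("ascii")).decode("ascii")

print(hex_form)
print(u16_form)
print(b64_form)
```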

How HTTP Origin Looks on the Wire/Network

Although plugins may be difficult to detect from a signature perspective, they leave a very specific fingerprint on the network when they make a connection. This is specifically in the "Origin" HTTP header. When a Google Chrome plugin makes a request, the actual Chrome extension UID is in the Origin header/field!
Origin: chrome-extension://pgeolalilifpodheeocdmbhehgnkkbak

You can then perform a google search on the UID (which is "pgeolalilifpodheeocdmbhehgnkkbak") to see what the actual plugin is from the Chrome Web Store. You can also use:
https://chrome.google.com/webstore/detail/$PlaceUIDHere

In my example I had to use the search method because this plugin was something I found back in 2015 and the UID has changed.

The plugin from this PCAP takes each URL that a user visits, double-base64-encodes it, and then POSTs it.



The Solution/Hunt


Bro Setup (Adding HTTP Origin)

If you leverage Bro logs, then we can easily search for malicious/suspicious Chrome plugins.
The first step is to add the HTTP Origin header to the Bro http.log. This can be accomplished by creating a simple Bro script to add the field, as seen here:
https://gist.github.com/neu5ron/cbfca0dfc42b1d6c96cd321d687e5495

In order to accomplish this as a whole (ie: restart bro and everything to add the log) you would perform the following:
BRODIR="/opt/bro";
CUSTOMDIR="/opt/bro/custom/additional_http_headers";
mkdir -p $CUSTOMDIR;

# Download the Bro script
wget https://gist.githubusercontent.com/neu5ron/cbfca0dfc42b1d6c96cd321d687e5495/raw/af6f2a36bef23ad77c34f1a4928fb5f36a30fc46/additional_http_headers-main.bro;
mv additional_http_headers-main.bro main.bro;
# Create the additional files needed to make the script a local Bro "package"
echo "@load ./main.bro" > __load__.bro
# Move all files to the $CUSTOMDIR
mv __load__.bro main.bro $CUSTOMDIR;

# Edit the bro directory to include our new script
echo "@load $CUSTOMDIR" >> $BRODIR/share/bro/site/local.bro;


Now, assuming you have Bro HTTP logs already going to ELK, we can perform some queries and stack-counting analysis to find outlier/anomalous Chrome extensions.


Exploring the Data in ELK


Performing a query to find any request made from a plugin is as simple as
origin:chrome


Visualizing the Data in ELK

Now performing stack counting to find outliers/anomalies is just as simple as creating a data table visualization on origin.keyword.
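Under the hood, that data table is just a terms aggregation; the raw request looks roughly like this (the index you POST it to depends on your setup):

```python
import json

# Stack counting = a terms aggregation on origin.keyword, restricted
# to requests that actually carry a chrome-extension origin.
body = {
    "size": 0,  # we only want the buckets, not the hits
    "query": {
        "query_string": {"query": "origin:chrome"}
    },
    "aggs": {
        "origins": {
            "terms": {"field": "origin.keyword", "size": 50}
        }
    }
}

# e.g. POST /bro-http-*/_search with this body
print(json.dumps(body, indent=2))
```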

We can then add the query/search from above (after saving it) and the visualization (after saving it) to a dashboard/single-view.


My example shows a 2-day query that returned only 2 Chrome extensions.
However, your environment may have many more. If your environment has many more plugins do not fear because stack counting is here.


Baselining (Stack Counting) or Whitelisting/Filtering

You can quickly whitelist a plugin's UID (using the techniques described above to find what the plugin actually is via google/chrome-store) so you never have to see the result again. You may also perform even more stringent filtering by adding "method:(POST OR PUT)" to your query.
However, this example is from a very large network, searching across 432,365,000+ HTTP logs. Even if there were 25+ extensions making outbound connections, it would take less than 5 minutes to whitelist the legitimate extensions -- then you do not have to worry about making your hunt more strenuous than it needs to be... which would result in possibly missing positives/malicious-activity.


Alerting Instead of Hunting/Exploring

For all of those who don't want to look at a dashboard or search/query..
After excluding any false positives -- we can turn our search into an X-Pack Watcher alert that emails/notifies us any time there is an extension.
That means you don't have to worry about malicious Chrome plugins going unnoticed and can focus on other things :)
\o/


2018-04-05

Typosquatting Detection with ELK & Zeek(Bro) NSM


DNS... I hope as network defenders we all know the value of it. Some may not; as my technical/CND lead once told me, "so what, it's just DNS". If that's the case, here is a short list:
There is the ability to cheaply block common malware/C2s, content related to at-work lawsuits, and SSL (via the hostname lookup). Just as commonly as it can be used to "defend", it is a great tool to proactively block malicious ad redirects (by blocking advertisement domains) or dynamic DNS. Yes... there was a time where you could simply block many "APT" groups just by blocking dynamic DNS lookups.
And outside of a purely "defensive" standpoint, DNS logs are important for detecting DNS exfil, DGA bots (a simple way is via NXDOMAIN; sometimes something more complex is required), and DNS as a C2 channel -- as well as for (retro-)hunting.

I want to show a use case of how Elastic ELK can be used to "hunt" for typo-squatting domains. Also, DNS was what got me into information security, and I have been wanting to blog about the things I have been doing with Bro (DNS) and ELK for years.. but I never take the time...
So I will make this quick :)

Everyone loves ambulance chasers in infosec ;) so what better way than to write a quick blog regarding typo-squatting detection shortly after Brian Krebs' recent article:
https://krebsonsecurity.com/2018/04/dot-cm-typosquatting-sites-visited-12m-times-so-far-in-2018/

Elastic has a very powerful text/string analysis engine, and with it you can perform queries they refer to as "fuzzy" (Levenshtein distance). So let's look for some domains that have some form of character addition/substitution.
Let's look at some popular domains that are sometimes spoofed and then used in malware C2 comms -- for this example, typo-squatting on google, microsoft, and only a few other domains (for the sake of brevity).
** Get creative and look for typo-squatting on your own company :) bad folks love to use your company's/entity's domain name for their C2. If you have to tune your search it will only take a few moments, and after that you could even turn your search into an alert that gets sent via email (or other forms of comms like Slack/text) using Elastic X-Pack Watcher OR something like ElastAlert. **


I have performed normalization on domain names that allows me to perform exact (match) queries on each level of a domain. For example "www.google.com": www = 3rd level, google = 2nd level, com = 1st level. The query for your environment may instead be "query:google~1".
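As a sketch, the raw fuzzy query would look something like this (the field name reflects my normalization and is hypothetical for your environment):

```python
import json

# Levenshtein-distance ("fuzzy") match on the 2nd-level domain:
# fuzziness 1 matches one character added/removed/substituted/transposed,
# e.g. "goog1e", "googel", "rnicrosoft" -> no match, too far.
body = {
    "query": {
        "bool": {
            "should": [
                {"fuzzy": {"domain_2nd_level": {"value": "google", "fuzziness": 1}}},
                {"fuzzy": {"domain_2nd_level": {"value": "microsoft", "fuzziness": 1}}}
            ],
            "minimum_should_match": 1,
            # don't alert on the real domains themselves
            "must_not": [
                {"term": {"domain_2nd_level": "google"}},
                {"term": {"domain_2nd_level": "microsoft"}}
            ]
        }
    }
}

print(json.dumps(body, indent=2))
```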

Last visualization.. If you normalize DNS, HTTP, and SSL into a common schema, then you can even perform one query to see all connections with a typo-squatting domain. Then you can quickly tell not only if there was a DNS lookup, but whether there was an actual HTTP(S) connection. Also, you can perform aggregations, which are the other true power of Elastic..
searching + aggregations = victory


2018-01-04

Canary Files for Legitimate Access Abuse using WEF & ELK


Network security monitoring and endpoint security defenders face monumental tasks in attempting to detect computer breaches. Many face lack of support from upper-management, money, time, incorrect sensor placement, physical resources, and any other problem you can think of. In addition to lack of support and resources there is also the issue of vulnerabilities in software completely out of their control (https://www.troyhunt.com/everything-you-need-to-know-about3/) along with devices that leave the network for extended periods of time (ie: user on vacation, traveling, or working remotely).

However, despite these disadvantages (and more later) there is hope in a simple yet effective solution! There is a builtin and free resource that will detect legitimate access abuse (ie: lateral movement, recon of network/shares/files, etc...).
I will be assuming that an attacker and an insider threat are the same "threat", because an attacker will, at some point, gain legitimate credentials, just as an insider threat would have legitimate credentials.

**If you are already familiar with canary files and their necessity in computer security monitoring, then skip to the Prerequisites & Solution sections**
Thus introducing computer security "canary files" (for the purpose of this article I will only discuss canary files; however, there are many other computer security canaries you can use). The name for computer security canaries is based on the real-world canaries used in coal mines to detect carbon monoxide (https://arlweb.msha.gov/century/canary/canary.asp). In this real-world example, if a canary was showing signs of distress, then this was a "clear signal" of danger (carbon monoxide).

Just as coal miners may not have had the ability to detect danger (in their case, physical limitations of the human body -- carbon monoxide is colorless, odorless, and tasteless), many network/endpoint defenders face limitations, out of their control, in detecting breaches! With the use of a canary file, we assume that if the file is accessed in any way, it is with malicious/malign intent. This allows you to perform additional duties while having something "watching your back", just as the coal miners could continue to work without having to worry about detecting carbon monoxide. 

Were real canaries the only way to detect trouble/danger? Doubtful... Are computer security canaries the only way to detect a breach/compromise? Nope.. The goal is to provide an easy win (Zero 2 Hero) in detecting a breach/compromise. Especially, if you face any of the following limitations:
  1. No internet access -- Most canaries require some sort of internet access. This article even covers the case where a device leaves the network and never connects to the internet, but a canary is accessed.
  2. No third party software allowed (or wanted that may increase attack surface)  -- We will be using Windows logs builtin to the Windows operating system that use Kerberos/default windows authentication (so it already is in existence in your network)
  3. You do not have a team of people monitoring your network 24x7 or you are just one of a few persons monitoring everything while also wearing 10 other hats.
  4. Detecting breaches does not make your company money (unless you are a re-seller/product/vendor -- obviously...) and the company's requirements are access and up-time. You have to assume your admins are performing all sorts of unwanted activities that may not be "malicious" but would trigger many other alerts from other products (ie: AV).. On a management network everything is an anomaly... everyone is installing software, everyone is troubleshooting issues at odd hours in odd locations of the world (ie: while on vacation).
There has been a lot of work done with canary files in computer security already. Specifically https://github.com/thinkst/canarytokens has a large and great use case of different types of canaries and they have been discussing/using these for 3+ years. They even have tokens for HTTP URL, DNS, QR Code, and more.  Also, there is already public discussions/blogs of canaries using builtin windows resources but these seem to be limited to ransomware (ie: https://www.eventsentry.com/blog/2016/03/defeating-ransomware-with-eventsentry-auditing.html).

So.. You may be asking yourself.. "Nate, why are you re-inventing the wheel?".. and to that I say: I am not, I am just polishing the wheel someone already invented. Trust me, I would much rather use someone else's work due to my own limited time.


I am proposing/showing an alternate solution simply because one of the best existing solutions requires third-party software, and most other discussions on using Windows logs are centered around ransomware. I want to show that this requires no software checklist approval, no money, and little to no resources. Leveraging something that is already builtin to the Microsoft Windows OS overcomes a huge hurdle when getting buy-in from your other IT departments (the ones who will probably have the permissions to deploy it), as well as if you are in a restrictive environment such as the Government :)

Proposed Solution

Using builtin Windows event logs (EventID:4663), enabled with a custom group policy and sent to the Elastic ELK stack (all are free), we are able to create our own builtin canary files. These builtin Windows logs provide additional benefits (for the purpose of canary files) over Sysmon and other Windows logs (ie: 4688). Please note that Sysmon should always be used if you can use it.. Just in the case of canaries, Windows EventID:4663 provides a broader scope of detection possibilities (shown later).


Prerequisites 

  1. You already have a Windows Event Forwarding (WEF) server setup. If not then please see the following (and reach out with any questions):
    • https://medium.com/@palantir/windows-event-forwarding-for-network-defense-cb208d5ff86f
    • https://github.com/palantir/windows-event-forwarding/blob/master/group-policy-objects/README.md
    • https://docs.microsoft.com/en-us/windows/threat-protection/use-windows-event-forwarding-to-assist-in-instrusion-detection
    • https://blogs.technet.microsoft.com/jepayne/2015/11/23/monitoring-what-matters-windows-event-forwarding-for-everyone-even-if-you-already-have-a-siem/
    • https://mva.microsoft.com/en-US/training-courses-embed/event-forwarding-and-log-analysis-16506/Video-Audit-Policy-KBwQ6FGmC_6204300474
    • https://blogs.technet.microsoft.com/wincat/2008/08/11/quick-and-dirty-large-scale-eventing-for-windows/
    • https://msdn.microsoft.com/en-us/library/windows/desktop/bb870973(v=vs.85).aspx
    • http://syspanda.com/index.php/2017/03/01/setting-up-windows-event-forwarder-server-wef-domain-part-13/
    • https://www.root9b.com/sites/default/files/whitepapers/R9B_blog_005_whitepaper_01.pdf
    • https://github.com/defendthehoneypot ---- DoD STIG GPOs
  2. You already have an Elastic ELK setup. If not then please see the following (and reach out with any questions):
    • https://cyberwardog.blogspot.com/2017/02/setting-up-pentesting-i-mean-threat_98.html
    • https://github.com/rocknsm/rock
    • https://github.com/Cyb3rWard0g/HELK
    • https://github.com/philhagen/sof-elk
    • http://blog.securityonion.net/2017/12/security-onion-elastic-stack-beta-3.html
    • https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04
  3. You have canary files that may pique an unwanted entity's interest. That would result in them opening/copying/etc. the file. For the purpose of this article and continuity we will use the file: "c:\users\public\documents\new-login-information.txt"... however, you may use any file that fits your purpose.

    You may accomplish deploying these files in many ways. I recommend creating a file and deploying/pushing it via a GPO (group policy objects) to all or targeted computers.

    Some examples of files would be files that would seem they contain passwords, network diagrams, PE's/EXE's of interest (ie: psexec), and whatever else you can dream.

Solution (overview)


First I will show you what GPO's to deploy in order to enable the windows event logs that will allow us to determine if these canary files have been accessed/read/etc...

Secondly, I will show how to create a WEF subscription to ONLY forward the canary-file events from the first step. This lets us limit these event logs (which may otherwise be high volume and costly in network bandwidth to ship/transfer). This will be useful if you have many small locations with limited bandwidth and/or just limited data storage.

Finally, I will show you what the events look like in ELK. Also, will show you the events versus Sysmon and other builtin windows logs.



Solution (detailed)

  1. On your Active Directory (AD) server create a group policy with whatever name you would like to whichever (or all) computers/devices you want to perform the canary monitoring on.
    This group policy will enable Object Access File System Auditing and define what would cause this event to occur via SACLs.
    We define Advanced Audit of Object Access for the File System. Then we define the SACLs in Global Object Access Auditing.


    **
    SACL resource from Microsoft:
    https://msdn.microsoft.com/en-us/library/windows/desktop/aa374872(v=vs.85).aspx
    I recommend NOT naming the GPO similar to my examples.. ie: do not use "canary" or some other name that may tip your hand too easily.
    **
  2. On your WEF server, create a subscription to look for EventID:4663 and the canary files you created in Prerequisite step 3.
    example: we will use "c:\users\public\documents\new-login-information.txt"
    Now we only want (for the sake of bandwidth and storage and event "overload") to get 4663 events that contain our canary files. Therefore, we will use an "advanced" XML WEF subscription (https://blogs.technet.microsoft.com/askds/2011/09/26/advanced-xml-filtering-in-the-windows-event-viewer/)
    as shown via the code here:
    https://gist.github.com/neu5ron/40874b46d4afc642725d0e00e32b3ddc


    **
    You can still use other existing subscriptions for 4663 that you have. This is only an example to collect 4663 with specific canary files. Adding this subscription would not impact or degrade any 4663 subscriptions you already have in place.
    **
    If you do not have logstash configs to pull and send the windows events or are specifically looking for ones for windows, I have some here that will get you started or use resources mentioned above in Prereq 2:
    https://github.com/neu5ron/WinLogsZero2Hero
  3. Now let's see what the events look like in Elastic ELK. This assumes you have a Windows log forwarder, whether NXLog (https://nxlog.co/) or Winlogbeat (https://www.elastic.co/downloads/beats/winlogbeat), set up on your WEF server and sending to your ELK stack. An example NXLog setup would look like:
    https://gist.github.com/neu5ron/ad82daa2d452f56ba56d3610d5a0124f

In the Kibana (ELK) example I have applied a search that removes "AccessMask:0x20000", to determine if the canary file was actually "read" versus just a directory being browsed (ie: a SACL check).
Also, because I have included all events with the canary file's name -- I have excluded EventID:4656 due to its scenario being similar to 4663.
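Put together, the search looks roughly like this in Lucene query syntax (field names come from my Logstash configs and may differ in yours):
"new-login-information.txt" AND NOT AccessMask:"0x20000" AND NOT EventID:4656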
I also show that Sysmon and Windows log 4688 will only show a process spawning original access to the canary file. For example, if someone opens a command prompt and then views/lists/etc. the canary file, nothing will log that except 4663. Those other events only show that I accessed the canary file via explorer with notepad. However, here are all the ways I accessed the canary file that EventID:4663 showed:
  1. Opened in explorer via notepad.
  2. Accessed/read after internet explorer was already opened.
  3. Accessed/read after command prompt was already opened.
  4. Accessed/read after powershell was already opened.

Notes

This event collection for canary file access will work even if a computer is offline while the file is accessed, because once the computer is connected back to your AD, the log will be forwarded to WEF and thus into ELK.

False positives may include:
-A user just browsing out of curiosity, but not with malicious/malign intent.
-Backup software? -- but this should be easy to whitelist via ProcessName != $BackupSoftware
-AV software? -- but this should be easy to whitelist via ProcessName != $AVSoftware

Additional information on EventID:4663
https://github.com/MicrosoftDocs/windows-itpro-docs/blob/master/windows/device-security/auditing/event-4663.md