YARA is an important piece in the defender's chess set. Depending on how you play the game, you can think of YARA as either a bishop or a rook — a powerful weapon in the hands of a threat hunter or a layer that makes the defender's wall a bigger obstacle to overcome.
However, chess is a game that takes a lifetime to master. Each defeat is a lesson. What separates a grandmaster from a beginner is the experience accumulated from every wrong move they’ve made. Grandmasters have lost more games than beginners have played. Losing is simply a part of the game.
Unfortunately, security defenders don’t have that luxury. Losing as a defender can mean a devastating blow to the organization they are in charge of protecting.
Unlike chess players, security professionals have many powerful aids at their side while playing the game. These can turn them into grandmasters with the ability to see five moves ahead. YARA is just that kind of aid. But, like chess, it does take time and patience to master.
Hunting with YARA
One way to think about YARA is as a binary data query language. And with the advent of large data lakes hosted in security clouds, the term has expanded to big data binary query language. YARA is an expressive way to apply regex matches to raw object content and its associated metadata. Whether or not the data lake objects are structured, YARA is the answer when it comes to searching through them.
Hunting with YARA means designing proactive big data queries that expose attacks before they have the chance to reach the organization. While this might sound counterintuitive to some, in chess, the game isn’t over when you castle. The fight continues outside the walls, with the aim of preventing the castle’s fall. How effective your pieces are outside the confines of the castle walls matters just as much as how sturdy that wall is.
There are two types of big data queries that YARA enables: continuous and retrospective. The former anticipates a future activity, while the latter confirms that the activity happened recently. Depending on the aim of the hunt, any combination of the two can be applied. And the rule itself typically doesn’t need to be specially adjusted to be effective for either of the two.
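The difference between the two query types can be sketched in a few lines of Python. This is only a model under stated assumptions: the "rule" is a plain predicate over bytes standing in for a compiled YARA rule, and the marker bytes, object names, and function names are all hypothetical. The point it illustrates is that the same rule is reused unchanged for both kinds of hunt.

```python
from typing import Callable, Iterable, Iterator

# A "rule" here is just a predicate over raw bytes. A real deployment would
# compile and evaluate an actual YARA rule instead.
Rule = Callable[[bytes], bool]


def hypothetical_rule(data: bytes) -> bool:
    """Stand-in for a compiled YARA rule (the marker bytes are made up)."""
    return b"\xDE\xAD\xBE\xEF" in data


def retro_hunt(rule: Rule, stored: dict[str, bytes]) -> list[str]:
    """Retrospective query: scan objects that are already in the data lake."""
    return [name for name, data in stored.items() if rule(data)]


def continuous_hunt(rule: Rule, incoming: Iterable[tuple[str, bytes]]) -> Iterator[str]:
    """Continuous query: evaluate each object as it arrives."""
    for name, data in incoming:
        if rule(data):
            yield name


# Hypothetical data: two stored objects and two newly arriving ones.
stored = {"old_a": b"\x00\xDE\xAD\xBE\xEF", "old_b": b"\x00\x01"}
incoming = [("new_a", b"\x11\x22"), ("new_b", b"\xDE\xAD\xBE\xEF\x00")]

retro_matches = retro_hunt(hypothetical_rule, stored)
live_matches = list(continuous_hunt(hypothetical_rule, incoming))
```

Here `retro_matches` holds what was already sitting in storage, while `live_matches` fills in as new objects show up.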
Hunting rules themselves can be either strict or loose. Strict rules are more appropriate when tracking the activity of a known attacker. On the other hand, loose rules are a perfect way to discover new threats and existing threat variants. However, the looser the rule, the more work it takes to go through the matches it produces. There’s a fine balance between the number of conditions the rule must satisfy and the number of matches it ultimately produces. That’s where a big data lake aimed at security research can help the defenders.
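To make the strict versus loose trade-off concrete, here is a small Python model that uses byte regexes in place of YARA hex strings. The byte sequences and sample names are invented for illustration and do not come from any real threat. The strict pattern fixes every byte, while the loose one replaces four operand bytes with wildcards.

```python
import re

# Hypothetical samples: three variants of the same made-up stub plus one
# unrelated file.
samples = {
    "known_variant": b"\x55\x8B\xEC\x68\x10\x20\x30\x40\xE8",
    "new_variant_1": b"\x55\x8B\xEC\x68\xAA\xBB\xCC\xDD\xE8",
    "new_variant_2": b"\x90\x55\x8B\xEC\x68\x01\x02\x03\x04\xE8",
    "unrelated": b"\x4D\x5A\x90\x00\x03\x00\x00\x00",
}

# Strict: every byte is fixed, comparable to the YARA hex string
# { 55 8B EC 68 10 20 30 40 E8 }.
strict = re.compile(re.escape(b"\x55\x8B\xEC\x68\x10\x20\x30\x40\xE8"))

# Loose: the four operand bytes are wildcards, comparable to
# { 55 8B EC 68 ?? ?? ?? ?? E8 }. re.DOTALL lets "." match any byte.
loose = re.compile(rb"\x55\x8B\xEC\x68.{4}\xE8", re.DOTALL)

strict_hits = sorted(name for name, data in samples.items() if strict.search(data))
loose_hits = sorted(name for name, data in samples.items() if loose.search(data))
```

The strict pattern finds only the variant it was written for. The loose one also surfaces the two new variants, and against a large corpus it would surface more unrelated files to review as well.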
Building YARA rules
In this blog, we’ll demonstrate how ReversingLabs can help security teams develop, test, and deploy powerful YARA rules with ease.
Let’s say we want to develop a rule to detect an unknown packer, which we intend to reverse engineer and create an unpacking method for. The first step in this journey is to acquire the necessary samples that are packed with said packer.
Starting from a single sample we have in our possession, we build a new YARA rule using ReversingLabs’ Spectra Analyze solution (formerly A1000). Next, we quickly check and validate our new rule against ReversingLabs’ Spectra Intelligence (formerly TitaniumCloud) threat repository, with matches being returned in real time as they occur. This provides an opportunity to review results immediately and iterate on the rule if necessary.
This is where ReversingLabs’ YARA rule version tracking helps developers. Each rule iteration can be verified to ensure the results meet the desired match expectations. If the rule fails to meet the mark, it can be quickly reverted to a previous version that worked better. This quick rule iteration makes it easy to fine-tune a loose hunting rule.
ReversingLabs Spectra Analyze - YARA rule editor history view
After the initial rule starts returning matches from ReversingLabs’ threat repository, which happens almost immediately, the matched files can be fetched into the local dataset.
Upon inspecting the results, we notice some unwanted matches due to the byte pattern being matched anywhere within the object. Fortunately, this can be quickly revised, while the rule is still running, to match at the entry point of the Portable Executable only. We revise the rule accordingly, and we also make an additional change to the rule to require that the code section be named “protect”.
We soon realize that we made a typo in this new rule iteration as the section should be named “.protect” (we forgot the dot). Since the locally downloaded objects no longer match the rule as they were meant to, the error is caught immediately. After it’s fixed, another run of hunting can begin. Our scan completes within an hour with the samples required to start working on the unpacker.
ReversingLabs Spectra Analyze - Portable Executable visualization view
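The three rule iterations described above can be modeled in Python as follows. This is a sketch, not the actual rule: the stub bytes, the `ParsedPE` type, and the sample files are hypothetical, and the PE is assumed to be already parsed. In YARA itself, the anchored match would typically be written with the `pe` module, along the lines of `$stub at pe.entry_point`.

```python
from dataclasses import dataclass

STUB = b"\x60\xE8\x00\x00\x00\x00"  # hypothetical packer stub bytes


@dataclass
class ParsedPE:
    """Minimal stand-in for an already-parsed Portable Executable."""
    data: bytes
    entry_offset: int  # file offset of the entry point
    section_names: tuple[str, ...]


def rule_v1(pe: ParsedPE) -> bool:
    # Iteration 1: the pattern may appear anywhere in the file.
    return STUB in pe.data


def rule_v2(pe: ParsedPE) -> bool:
    # Iteration 2: anchored at the entry point, but with the section-name typo.
    return pe.data.startswith(STUB, pe.entry_offset) and "protect" in pe.section_names


def rule_v3(pe: ParsedPE) -> bool:
    # Iteration 3: same anchor, with the section name corrected to ".protect".
    return pe.data.startswith(STUB, pe.entry_offset) and ".protect" in pe.section_names


# A file packed with the hypothetical packer: stub sits at the entry point.
packed = ParsedPE(b"\x00" * 16 + STUB + b"\x90" * 8, 16, (".text", ".protect"))
# An unwanted match: the same bytes occur, but not at the entry point.
stray = ParsedPE(STUB + b"\x00" * 16 + b"\xC3", 22, (".text", ".data"))

results = {
    "v1": (rule_v1(packed), rule_v1(stray)),
    "v2": (rule_v2(packed), rule_v2(stray)),
    "v3": (rule_v3(packed), rule_v3(stray)),
}
```

The first iteration matches both files, the second matches neither because of the typo, and the third matches only the packed file.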
This kind of quick iteration during development is imperative for rapid rule creation. Equally important is the fact that the match history persists even when the rule is changed. As such, previous matches can be studied and used to improve the hunting rule further.
ReversingLabs Spectra Analyze - YARA match history select view
Historical matches make it easy to catch a typo like the one we made in the second iteration of our rule. Rule validation is as simple as selecting the previous version and reprocessing available files. This points out any files that are expected to be matched by the rule but aren’t.
ReversingLabs Spectra Analyze - YARA match history reanalysis
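At its core, this validation step is a set difference between the files that should match and the files that do. The sketch below assumes a hypothetical match history keyed by rule version, with made-up file names. It is a model of the idea, not how Spectra Analyze stores its history.

```python
def missing_expected(expected: set[str], matched: set[str]) -> list[str]:
    """Files that should match the rule but were not matched by this version."""
    return sorted(expected - matched)


# Files the analyst has confirmed as packed with the target packer (hypothetical).
confirmed = {"sample_a", "sample_b", "sample_c"}

# Hypothetical match history after reprocessing the local files per version.
matches_by_version = {
    1: {"sample_a", "sample_b", "sample_c", "noise_x"},  # pattern anywhere
    2: set(),                                            # section-name typo
    3: {"sample_a", "sample_b", "sample_c"},             # typo fixed
}

report = {
    version: missing_expected(confirmed, matched)
    for version, matched in matches_by_version.items()
}
```

A non-empty entry in `report` flags a version that lost expected matches, which is how the typo in the second iteration stands out.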
Regardless of the type of rule being developed, this workflow is a prerequisite for creating and validating good YARA rules. Developing good rules is extremely important since a bad rule will just generate operational cost while providing no protection benefits.
Given that the intent of our example is to create an unpacker for this unknown executable format, it’s a good idea to continuously monitor the rule for matches. While this could be done by logging into the Spectra Analyze console, there’s a better way. Every YARA ruleset can be subscribed to, so when there’s a match, an email notification is sent. This is especially useful for rules that only occasionally yield a few, but very important, matches.
ReversingLabs Spectra Analyze - YARA match subscription view
Searching with YARA
YARA is a big data query language that can be easily combined with the advanced search capabilities of Spectra Analyze. While YARA specializes in being an object content-matching language, ReversingLabs’ advanced search feature is a metadata enrichment and correlation language. Things that are simple to express with one can be difficult with the other, and vice versa. That’s why they complement each other well.
It’s not uncommon to start prototyping YARA rules via advanced search queries or, conversely, to create a crude YARA rule and filter down its results using search functions.
Given the previous example, where the Portable Executable section name was added in a later rule iteration, it’s easy to demonstrate how advanced search would have helped in this situation.
Since the YARA rule includes a user-defined tag ‘packer’, it’s simple to find the matching files using advanced search capabilities. By expanding the search query with the forgotten section name, we’re able to achieve the same effect as iterating on the rule. Plus, we’re able to easily filter out all the malicious files. This is very helpful since the packer we’re interested in is predominantly used by clean software, so we can simply filter out all packed files that were infected by viruses.
ReversingLabs Spectra Analyze - Advanced search with YARA tags, file metadata, and classification
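The combination of a YARA tag, a file metadata condition, and a classification filter can be modeled as a filter over indexed records. The record type, field names, and values below are hypothetical and are not Spectra Analyze query syntax. They only illustrate how the three conditions narrow the result set together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical search-index entry; field names are illustrative only."""
    sha1: str
    yara_tags: frozenset[str]
    section_names: frozenset[str]
    classification: str  # e.g. "goodware" or "malicious"


def search(records, *, tag, section, exclude_classification):
    """Keep records with the tag and section, minus one classification."""
    return [
        r.sha1
        for r in records
        if tag in r.yara_tags
        and section in r.section_names
        and r.classification != exclude_classification
    ]


index = [
    FileRecord("aaa1", frozenset({"packer"}), frozenset({".text", ".protect"}), "goodware"),
    FileRecord("bbb2", frozenset({"packer"}), frozenset({".text", ".protect"}), "malicious"),
    FileRecord("ccc3", frozenset({"packer"}), frozenset({".text", ".data"}), "goodware"),
    FileRecord("ddd4", frozenset(), frozenset({".protect"}), "goodware"),
]

results = search(index, tag="packer", section=".protect", exclude_classification="malicious")
```

Only the first record survives all three conditions: it carries the tag, has the section, and is not classified as malicious.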
ReversingLabs also makes it easy to refine the matches by any number of metadata filters, including the ones that are not part of the raw object contents like extracted properties, timestamps, download locations, classification, similarity, and even associated threat actors. And, with 100+ keywords, over 280 automatically applied system tags, plus support for custom user tags, it’s possible to build more than 500 unique search queries. At this point, the only limitation to combining these queries becomes how wild the imagination is.
Protecting with YARA
Another way to think about YARA is as a pattern-based detection engine. Pattern matching is the most accurate way to detect a threat. But the accuracy of such detections is proportional to the rule’s rigidity. The more accurate the rule, the less resilient to changes it is. Likewise, the more accurate the rule, the fewer unwanted detections it generates. The key to writing good detection rules is finding the balance between the two.
In terms of protection, YARA rules can play a big role in determining the type of malware that gets detected. Accurate pattern matching is a great way to refine heuristic detections into more exact ones. In such systems, heuristic detections act proactively while signatures reinforce their decisions and help to prioritize response.
ReversingLabs’ malware analysis and threat hunting solution is based upon this principle. In addition to the dozen detection technologies it employs to detect threats, its classification can also be extended through YARA rules. More so than any other security platform, ReversingLabs allows security teams to integrate their best YARA detection rules natively into its classification logic. This extends beyond simple detection to include naming and scoring threats. The results of this native integration can be seen everywhere from reporting to alerting and advanced search.
Converting existing rules
With ReversingLabs, any existing rule can be quickly converted into a threat-detection rule. By extending the list of tags with the keyword “malicious”, the intent to classify matched objects is signaled. Such matches will influence the solution’s classification logic and convict any sample they are matched on.
While that is a good way to classify objects, there is a better option. With the addition of the “tc_detection” tag, the ability to name detected threats is unlocked. Fully qualified threat names include a threat type, family name, and its severity. All of these are configured within the rule and are applied as a malicious classification to the objects that match it.
ReversingLabs Spectra Analyze - Threat detection YARA rule syntax
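As a rough sketch of what such a rule might look like in text form, the Python helper below renders a rule with the two tags and three metadata fields for type, family name, and severity. The metadata field names (`tc_detection_type`, `tc_detection_name`, `tc_detection_factor`) are an assumption modeled on ReversingLabs' publicly released rules and should be checked against the product documentation. The helper, rule name, family name, and byte pattern are hypothetical.

```python
def build_detection_rule(rule_name: str, threat_type: str, family: str,
                         factor: int, hex_pattern: str) -> str:
    """Render a threat-detection YARA rule as text.

    Assumption: the tc_detection_* meta field names mirror those seen in
    ReversingLabs' public rules; verify them before relying on this.
    """
    lines = [
        f"rule {rule_name} : tc_detection malicious",
        "{",
        "    meta:",
        f'        tc_detection_type = "{threat_type}"',
        f'        tc_detection_name = "{family}"',
        f"        tc_detection_factor = {factor}",
        "    strings:",
        f"        $pattern = {{ {hex_pattern} }}",
        "    condition:",
        "        $pattern",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical rule: every value here is a placeholder.
rule_text = build_detection_rule(
    rule_name="Hypothetical_Dropper",
    threat_type="Trojan",
    family="ExampleFamily",
    factor=5,
    hex_pattern="60 E8 00 00 00 00",
)
```

Printing `rule_text` shows the tags after the rule name and the naming fields in the `meta` section, which is where the threat type, family, and severity would be configured.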
ReversingLabs Spectra Analyze - Threat detection with a YARA rule
Deploying YARA rules
It's crucial to be able to continuously deploy new protection measures to fight the ever-evolving threat landscape. ReversingLabs empowers security practitioners with an advanced, yet easy-to-use solution for all YARA needs, including the ability to easily import rules, build custom rules, test and evaluate rule efficacy, and deploy rules to protect the organization – all from a single interface. And, just as importantly, users get a solution that can easily scale, which is a key requirement for large enterprises.
Enterprise security teams must monitor and secure many different entry points into the organization, including email, endpoints, cloud storage, mobile, web application uploads, and more, without impacting or slowing down critical business processes. This is where ReversingLabs’ Spectra Detect solution comes into play. It starts with the solution's ability to automatically ingest and analyze high volumes of files (millions of files per day) from any modern data source, without performance implications. Security teams can then leverage Spectra Detect’s centralized YARA rule management and enterprise-wide YARA scanning capabilities, including custom rule matching and targeted retro-hunts against very large datasets at record scale and speed. No other solution allows YARA matching and pivoting on thousands of characteristics on all extracted files and objects. With ReversingLabs, large-scale YARA rule deployments become simple, allowing security teams to greatly improve threat detection across the organization and close existing security gaps.
Open-source YARA rules
While this blog outlines more advanced defense strategies for those with existing YARA rule deployments, those without any haven’t been forgotten. There’s no better time to start using YARA within your organization than today. Deploying YARA can start with a single, high confidence, threat-detection rule.
To that end, ReversingLabs is making a sizable contribution to threat defender toolboxes by open sourcing its threat-detection YARA rules. The initial public release is composed of more than a hundred rules that are built to detect various Windows and Linux malware families. Once deployed, these rules can detect a multitude of malware downloaders, viruses, trojans, exploits, and ransomware.
ReversingLabs Spectra Analyze - Threat detection via open source YARA rules
These YARA rules are built with the goal of providing zero false-positive detections. To achieve this goal, and ensure their quality, these rules are put through rigorous testing in the ReversingLabs threat repository, which consists of over 40 billion unique binaries. Only the rules that meet these strict criteria are considered for publication.
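The idea behind such a quality gate can be sketched as follows. This is a simplified model and not ReversingLabs' actual test process: the rules are plain predicates over bytes, the corpora are tiny made-up dictionaries, and the release criterion (no matches on clean files and at least one detection on the target family) is an assumption chosen for illustration.

```python
from typing import Callable

Rule = Callable[[bytes], bool]


def evaluate_for_release(rule: Rule, clean_corpus: dict[str, bytes],
                         family_samples: dict[str, bytes]) -> dict:
    """Check a candidate rule against clean files and known family samples."""
    false_positives = sorted(n for n, d in clean_corpus.items() if rule(d))
    detections = sorted(n for n, d in family_samples.items() if rule(d))
    return {
        "false_positives": false_positives,
        "detections": detections,
        # Assumed criterion: zero false positives and at least one detection.
        "publishable": not false_positives and bool(detections),
    }


# Hypothetical corpora and candidate rules.
clean = {"calc": b"MZ\x90\x00clean", "notepad": b"MZ\x90\x00other"}
family = {"s1": b"MZ\x90\x00\xDE\xAD\xBE\xEF\x13\x37"}


def too_loose(data: bytes) -> bool:
    return data.startswith(b"MZ")  # matches every PE-like file


def candidate(data: bytes) -> bool:
    return b"\xDE\xAD\xBE\xEF\x13\x37" in data  # made-up family marker


loose_report = evaluate_for_release(too_loose, clean, family)
candidate_report = evaluate_for_release(candidate, clean, family)
```

The overly loose rule is rejected because it flags both clean files, while the candidate rule passes the gate.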
As threat-detection rules, these YARA rules make an attribution to both the malware type and its family, or variety. With such results, defenders can quickly pivot from a malware detection event to threat response. Knowing that a YARA rule has detected ransomware with a high degree of precision can mean the difference between a prevented attack and one that slips by because it was left waiting for investigation to determine its importance.
Check out the ReversingLabs GitHub repository for new and updated rules that detect the latest threats.
The time is now to start leveling up your YARA game, and ReversingLabs can help.
Watch our videos:
- Whiteboard video: Identifying File Content with YARA Rules
- How To video: How to Hunt for Threats Using YARA Rules