Create YARA Rules In Python
Overview: A library to write YARA rules in Python now exists.
Get the code: https://github.com/matonis/yara_tools
Check out the examples: https://github.com/matonis/yara_tools/examples
Today I’m introducing yara_tools, a library that I’ve created to help security researchers create simple or complex YARA rules in Python. Because it is written in Python, it integrates anywhere Python already does. If you don’t know about YARA, please read the docs.
YARA is the bread and butter of malware research & defense… and yara_tools is its butter knife.
yara_tools is very easy to use but does introduce some new & unique concepts. I think you’ll appreciate them when it comes to automating your workflows. I encourage you to check out the examples on the GitHub page.
Diving In: What Makes yara_tools Unique?
yara_tools lets you do prrreeetttyyy much anything you’d want to do when creating a YARA rule but with a few perks.
- The boring stuff that goes into making a rule is pretty much automated (variable names & integer increments).
- The annoying stuff such as dealing with binary data, for loops, and adding metadata & comments is now easy.
- The really frustrating stuff like creating and maintaining complex conditions & relationships is now a little more intuitive.
Creating A Rule
Check out: Example 1 — “Hello World”
Making a rule is simple. Each rule is created with a single constructor call and returned as an object. Every aspect of a rule can be controlled via a combination of functions and parameters.
As long as your rule has some strings or a condition, yara_tools.build_rule() will give you a rule in return.
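To make the idea concrete, here is a minimal, stand-alone sketch of what a rule builder does: it names string variables automatically and refuses to emit a rule unless there is at least one string or a condition. The `TinyRule` class and its methods are hypothetical and written only for illustration; they are not the yara_tools API (see Example 1 in the repository for the real calls), and the fallback to `any of them` is an assumption of this sketch.

```python
class TinyRule(object):
    """Hypothetical, simplified rule builder (not yara_tools itself)."""

    def __init__(self, name):
        self.name = name
        self.strings = []      # text strings, kept in insertion order
        self.condition = None  # optional explicit condition

    def add_string(self, value):
        self.strings.append(value)

    def build_rule(self):
        if not self.strings and not self.condition:
            raise ValueError("a rule needs at least one string or a condition")
        lines = ["rule %s" % self.name, "{"]
        if self.strings:
            lines.append("    strings:")
            for index, value in enumerate(self.strings):
                # Identifiers ($s0, $s1, ...) are generated automatically.
                escaped = value.replace("\\", "\\\\").replace('"', '\\"')
                lines.append('        $s%d = "%s"' % (index, escaped))
        lines.append("    condition:")
        # Assumption for this sketch: fall back to "any of them".
        lines.append("        %s" % (self.condition or "any of them"))
        lines.append("}")
        return "\n".join(lines)


rule = TinyRule("hello_world")
rule.add_string("Hello")
rule.add_string("World")
rule_text = rule.build_rule()
print(rule_text)
```

The point of the design is that the caller never types `$s0` or `$s1` by hand, so feeding the builder a thousand strings from a script costs no more effort than feeding it two.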
Helping Out With Automation
One of the key drivers behind extending yara_tools was to assist with forward-looking research.
Often, you may find a decoding/encoding algorithm, a DGA, or a flaw in the logic of the malware author and succeed in brute forcing a number of things to expand coverage or do some hunting. I built as many automatic facilities into yara_tools as I could, so you can throw as much data at it as you need and still walk away with a rule.
Spend more time coding solutions & less time kicking your screen while formatting a rule.
Expanding Context & Delivery
One of the more boring yet deceptively useful features of yara_tools is its facility for adding comments. This is especially useful when you need to return to an obscure rule and understand what the heck is going on.
- Add as many meta fields as you like.
- Enjoy the ability to add in-line comments to strings.
Handling Binary Data
Pass yara_tools.add_binary_strings() a raw piece of binary data and get an automatically-added hex-string in return.
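For readers unfamiliar with YARA hex strings, the conversion itself can be sketched in a few lines. This is an illustration of the general technique, not the code inside `add_binary_strings()`; the helper name `to_hex_string` and the `$b0` identifier are made up for this example.

```python
def to_hex_string(data):
    """Render raw bytes as a YARA hex string body, e.g. { 4D 5A 90 00 }."""
    if not data:
        raise ValueError("cannot build a hex string from empty data")
    # bytearray() yields integers on both Python 2 and Python 3.
    return "{ " + " ".join("%02X" % byte for byte in bytearray(data)) + " }"


header = b"MZ\x90\x00"
hex_string = to_hex_string(header)
print("$b0 = " + hex_string)  # prints: $b0 = { 4D 5A 90 00 }
```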
Complex Conditions: Introduction to “Condition Groups”
Check Out: Example 5 — (Kaspersky & STONEDRILL)
My ultimate benchmark in measuring the capabilities of yara_tools was the ability to create complex conditions.
I spent a lot of time studying very obscure rules that I and others in the community have written, and went through a few scenarios to ensure that I could at least create the same rule both programmatically and syntactically.
In my introduction, I mention that a combination of factors goes into authoring a decent YARA rule. Here’s what a fictitious yet horrifyingly real-world condition usually looks like, coming from an analyst who has studied a corpus of related malware and determined that a particular combination of strings makes it unique:
> (all of ($s*) and 3 of ($x)) or (any of ($y*) and 1 of ($s*)) or ($s1 and #o >= 42 and any of ($a,$b,$c,$d))
To facilitate complex conditions like these, I aimed to replicate how I believe people think when composing a YARA rule.
Enter “Condition Groups.”
In the opening image for this section, I illustrate a complex set of conditions & relationships to demonstrate how conditions can be used, re-used, related, and chained. Each node represents a group of conditions and each line represents a nested relationship.
During your journey of authoring a YARA signature, you may have a set of conditions that represent a singular set of attributes (a “thing”) that makes something unique. You may want to use that singular “thing” in another group of “things” and those “things” as part of another set of “things” and so on. Instead of re-creating the same condition multiple times, you can create one group of conditions, which can later be referenced from one or more other groups of conditions.
This concept is referred to in yara_tools as a Condition Group.
Condition Groups are containers for conditions.
- A condition group has a single configurable boolean that applies to all expressions within it.
- A condition group may contain one or many expressions and conditions.
- A condition group can be negated/inverted (the not modifier).
- An expression can be used within many condition groups.
- Condition groups can be related to one another and nested.
Ideally, you should use condition groups to collect expressions that share a similar purpose and spirit.
- For example, a common anchoring set of conditions & expressions often revolves around validating file formats (e.g. PECOFF, MZ header, DOS string, sections, etc.).
yara_tools allows you to create relationships between condition groups by defining “parent_group” in the corresponding call, and as illustrated, condition groups can have a 1:N relationship.
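The properties listed above (one boolean per group, optional negation, reuse, and nesting) can be modeled in plain Python. The sketch below is a hypothetical model for illustration only: the `Group` class is not part of yara_tools, and it nests by adding a child to a parent rather than by passing “parent_group”.

```python
class Group(object):
    """Hypothetical model of a condition group (not the yara_tools code)."""

    def __init__(self, boolean="and", negate=False):
        self.boolean = boolean  # one boolean joins every member of the group
        self.negate = negate    # render the whole group behind "not"
        self.members = []       # expressions (str) or nested Group objects

    def add(self, member):
        self.members.append(member)
        return self

    def render(self):
        parts = [m.render() if isinstance(m, Group) else m
                 for m in self.members]
        body = "(" + (" %s " % self.boolean).join(parts) + ")"
        return "not " + body if self.negate else body


# A reusable "thing": the file looks like a small PE.
pe_check = Group("and").add("uint16(0) == 0x5A4D").add("filesize < 2MB")

# Two parents reuse the same child group (a 1:N relationship).
family_a = Group("and").add(pe_check).add("all of ($s*)")
family_b = Group("and").add(pe_check).add("any of ($y*)")

top = Group("or").add(family_a).add(family_b)
condition = top.render()
print(condition)
```

Because `pe_check` is defined once and referenced twice, changing the file-format check later means editing a single group instead of hunting through a long hand-written condition.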
Always remember, define your groups incrementally as they would ideally be represented on paper and please… please… Keep Your Condition Groups Simple (KYCGS).
Hidden Gotchas & Last Words
First and foremost, yara_tools does not validate your inputs or the rules you make. It knows the format and gives you functions to freely add things to it. Given how much YARA has grown in recent years, this was by design.
Secondly, condition groups are new and might cause some headaches. So, I give you a piece of advice that you must not forget: conditions are order-based.
- The order in which you add conditions matters.
- The order in which you add condition groups also matters.
- There is a concept of a “global condition”, that is, a condition that is not a condition group. Those are always added first. Condition Groups are added after.
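The ordering rules above can be pictured with a small stand-alone sketch: global conditions are emitted first, condition groups after, and each list keeps its insertion order. The function name `compose_condition` and the choice of `and` as the joiner are assumptions made for this illustration, not a description of how yara_tools joins things internally.

```python
def compose_condition(global_conditions, condition_groups, joiner="and"):
    """Join global conditions first, then groups, preserving insertion order.

    The default "and" joiner is an assumption made for this sketch.
    """
    ordered = list(global_conditions) + list(condition_groups)
    if not ordered:
        raise ValueError("nothing to compose")
    return (" %s " % joiner).join(ordered)


# Groups are shown pre-rendered as strings to keep the sketch short.
globals_first = ["uint16(0) == 0x5A4D", "filesize < 500KB"]
groups_after = ["(all of ($s*) or 2 of ($x*))", "(not $benign)"]

final_condition = compose_condition(globals_first, groups_after)
print(final_condition)
```

Swapping the order in which items are appended to either list changes the rendered condition, which is why it pays to add cheap checks (magic bytes, file size) before anything else.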
yara_tools was initially developed about four years ago when I built the beta version of ripPEv2 (get the OG version of ripPE here!); however, it has seen only limited testing. I dev’d it & use it, but it still has some way to go before it is fully robust. There will be bugs.
Fork & please submit a pull request!