Writing Custom Rules for YARA


The Structure of YARA Rules

Before writing custom YARA rules, it helps to understand their structure, which is simple to read and write. Every rule has an identifier and up to three sections: meta, strings, and condition. The meta section can be omitted if the rule carries no additional data, and the strings section can be omitted if the rule does not rely on any strings, but the condition section is always required.

rule Identifier
{
    meta:
      creator = "name here"
      date = "date here"
      description = "description here"
      
    strings:
      $string_a = "text here"
      $string_b = "text here"
      
    condition:
      $string_a and $string_b
}
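As a minimal sketch of a rule that omits both optional sections, the example below keeps only the required condition. The rule name and the size threshold are illustrative assumptions, not recommendations.

```yara
// Hypothetical rule: no meta section and no strings section.
// It matches any scanned file larger than 10 megabytes.
rule LargeFile
{
    condition:
        filesize > 10MB
}
```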

Rule Identifier

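Each rule in YARA starts with the keyword rule, followed by a rule identifier. Identifiers may contain alphanumeric characters and underscores, but the first character cannot be a digit. Rule identifiers are case sensitive and cannot exceed 128 characters.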

rule Identifier
{
  ...
}

Note: YARA reserves certain keywords (for example rule, meta, strings, condition, and, or, not, all, any, and them), so they cannot be used as identifiers.
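The naming constraints can be illustrated with a short sketch. The rule names here are hypothetical, and the conditions are placeholders (false) so that neither rule ever matches.

```yara
// Letters, digits, and underscores are allowed; the first character is not a digit.
rule Suspicious_Dropper_v2
{
    condition:
        false
}

// Identifiers are case sensitive, so this is a different rule from the one above.
rule suspicious_dropper_v2
{
    condition:
        false
}

// Not valid (shown as comments only):
//   rule 2nd_Stage { ... }   starts with a digit
//   rule condition { ... }   "condition" is a reserved keyword
```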

Rule Meta

In the meta section, the rule author can record details about the rule, such as author, creation date, description, and version. Metadata is purely informational: it cannot be referenced in the condition section and has no effect on how a sample is matched.

rule Identifier
{
  meta:
    author = "rangeforce"
    description = "description here"
    version = "0.1"
    
...

To print a rule's metadata alongside each match, run YARA with the -m (--print-meta) option.
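Metadata values are not limited to strings; integers and booleans are also accepted. The sketch below assumes hypothetical field names (in_the_wild, revision) chosen only for illustration, since meta identifiers are free-form.

```yara
rule MetaExample
{
    meta:
        author = "rangeforce"              // string value
        description = "description here"   // string value
        revision = 3                       // integer value (hypothetical field)
        in_the_wild = true                 // boolean value (hypothetical field)

    condition:
        false   // placeholder so the rule compiles without a strings section
}
```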

Rule Strings

Strings Analysis

YARA rules usually match on strings or byte patterns contained in a sample, so there is generally no need to reverse-engineer the sample itself.

Extract all strings from the sample with the strings tool, then separate the benign strings from the suspicious ones. This process is manual, repetitive, and time-consuming. To simplify it, look into the options outlined in the automated YARA Rule Generation material.

Strings Section

In the strings section, users declare string variables and assign values to them. Each variable name begins with the $ sign, and its value can be a text string, a hexadecimal byte sequence, or a regular expression. This is the place to put the suspicious strings found in the sample.

rule Identifier
{
  meta:
    author = "rangeforce"
    description = "description here"
    version = "0.1"
    
  strings:
    $string_a = "text here"
    $string_b = "text here"
  
...

To display the strings that matched in the sample, run YARA with the -s (--print-strings) option.
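Beyond plain text, YARA also accepts hexadecimal and regular-expression strings, and text strings can take modifiers. The following is a sketch with made-up values; the rule name, variable names, and patterns are assumptions for illustration, not indicators from a real sample.

```yara
rule StringTypesExample
{
    strings:
        // Text string; nocase ignores letter case, wide also matches UTF-16 encoding
        $text = "text here" nocase wide ascii

        // Hexadecimal string; ?? is a wildcard for any single byte
        $hex = { 4D 5A ?? 00 }

        // Regular expression, delimited by forward slashes
        $regex = /md5: [0-9a-fA-F]{32}/

    condition:
        any of them
}
```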

For more information on this topic, see the guidelines for getting the best performance from YARA rules.

Rule Condition

The rule condition is a Boolean expression that describes what the user is looking for in the sample; the rule matches when the expression evaluates to true. Conditions can contain Boolean and relational operators, and arithmetic and bitwise operators can be used on numerical expressions. The example below uses the Boolean operator and to require that both strings are present.

rule Identifier
{
    meta:
      creator = "name here"
      date = "date here"
      description = "description here"
      
    strings:
      $string_a = "text here"
      $string_b = "text here"
      
    condition:
      $string_a and $string_b
}

More examples of the operators and how they are used in conditions can be found in the official YARA documentation. The Performance Guidelines also describe ways to optimize conditions.
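The condition above uses only and. As a hedged sketch of how relational, arithmetic, and bitwise operators might be combined, consider the rule below; the rule name, the thresholds, and the choice of checking for an "MZ" header are illustrative assumptions.

```yara
rule ConditionExample
{
    strings:
        $string_a = "text here"
        $string_b = "other text"
        $string_c = "more text"

    condition:
        // Bitwise: mask the first four bytes and compare the low 16 bits with "MZ"
        (uint32(0) & 0xFFFF) == 0x5A4D and
        // Relational and arithmetic: file is smaller than 2 MiB
        filesize < 2 * 1024 * 1024 and
        // Boolean: at least two of the strings, or more than three hits of $string_a
        (2 of them or #string_a > 3)
}
```

Here #string_a is the number of times $string_a occurs in the sample, and them refers to every string declared in the rule.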

Rule Application

The best test of a YARA rule is to scan real samples with it. To run a rule from the command line, use the following command:

yara [OPTIONS] RULE_FILE TARGET

RULE_FILE is the file that stores the YARA rules you want to use, while TARGET is the file, directory, or running process (given by its process ID) to be scanned.
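To try the command end to end, a harmless rule like the following can be saved to a file and run against a test file that contains the marker text. The file names (hello.yar, sample.txt), the rule name, and the marker string are all hypothetical.

```yara
// Save as hello.yar, create sample.txt containing the marker text, then run:
//   yara -s -m hello.yar sample.txt
// -s prints the matching strings and -m prints the rule's metadata.
rule HelloMarker
{
    meta:
        description = "Matches a harmless marker string for testing"

    strings:
        $marker = "hello-yara-test"

    condition:
        $marker
}
```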

Conclusion

Over time, users will gain rule-writing experience and create rules that work well for malware detection and classification. When users produce effective new rules, they should consider contributing that work to the broader malware research community. After all, it’s the right thing to do.

References:

Mykyta Zaitsev