We have noticed that academic researchers often work with very poor malware sets, or struggle to obtain a good one. We want academia to work with better samples, so that its research improves and we all benefit from better malware, adware and PUP detection.
What is CARMA?
ElevenPaths Curated Android Malware APK Set (CARMA) is a free service provided by the Innovation and Labs area of ElevenPaths. It provides a free set of malware samples, adware and other potentially dangerous files collected for the Android operating system. These samples may be used exclusively for research or academic purposes; any other use is forbidden. The sets are intended to provide quality samples for analysis with expert systems, Machine Learning, artificial intelligence or any other method that helps improve the future detection of these kinds of threats.
We provide a set of complete malware samples in their original, unaltered format, sorted by year, origin and type of threat. The samples come from Google Play and other markets and cover PUP, adware, malware and so on, classified by year since 2017. And there is goodware too!
How has the classification been made?
Classifying malware based on antivirus results has advantages, but disadvantages as well. If you train a system on the findings of an antivirus, it can learn at most what that antivirus knows, or approximate similar results. To make matters worse, if the samples used for training are unclearly labeled (as often happens across antivirus engines), the system may lump together elements as different as adware and a Trojan, and consequently lose effectiveness.
For our set, we have started from the results of some renowned antivirus engines, but we have also applied other rules. For instance, we require agreement on the labels when assessing a threat, and the resulting sets must not overlap. Moreover, we have considered more variables: whether the markets have removed the samples, whether the samples were available there long enough, and the consensus of several technologies on categorization.
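As an illustration only, the label-agreement and non-overlap rules could be sketched as follows. This is not the actual CARMA pipeline: the function names, the `min_agree` threshold and the category strings are hypothetical, and the market-related variables (removal, time available) are not modeled here.

```python
from collections import Counter
from typing import Dict, Optional, Set


def consensus_label(engine_labels: Dict[str, Optional[str]],
                    min_agree: int = 3) -> Optional[str]:
    """Return the category all detecting engines agree on, or None.

    engine_labels maps a (hypothetical) engine name to the category it
    assigned, or None if that engine did not flag the sample.
    """
    votes = Counter(label for label in engine_labels.values() if label is not None)
    # Reject samples with no detections or with conflicting categories.
    if len(votes) != 1:
        return None
    (label, count), = votes.items()
    # Require a minimum number of engines behind the agreed label.
    return label if count >= min_agree else None


def build_sets(samples: Dict[str, Dict[str, Optional[str]]],
               min_agree: int = 3) -> Dict[str, Set[str]]:
    """Group sample hashes by consensus category.

    Each sample receives at most one category, so the resulting sets
    cannot overlap.
    """
    sets: Dict[str, Set[str]] = {}
    for sample_hash, labels in samples.items():
        label = consensus_label(labels, min_agree)
        if label is not None:
            sets.setdefault(label, set()).add(sample_hash)
    return sets
```

A sample flagged as adware by one engine and as a Trojan by two others would be discarded under this sketch rather than assigned to either set, which is one way to avoid the mixed labels described above.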
The system is not perfect (it never will be), but it makes up for some of the common flaws we have found. Providing a significant number of samples (something analysts appreciate) further mitigates those flaws. The goal is quality research in the field of Android malware detection. The sets may be used freely for academic purposes only, and under no circumstances for profit.
I am an organization that conducts research. How do I get it?
CARMA comes as an extension of Tacyt, our more complete service for researchers. You only need to justify your intended use via this form, and we will reply to you manually. You must sign an engagement and understanding document in which the only commitment is mutual acknowledgement.
All the info here: https://tacyt.elevenpaths.com/carma