Ransomware activities, which encrypt users' sensitive data and demand ransom payments in bitcoin to conceal the criminals' identities, have surged recently. Regulatory agencies need to identify as many ransomware addresses as possible to accurately estimate the impact of these activities. However, existing methods for detecting ransomware addresses rely primarily on time-consuming data collection and clustering heuristics, and they face two major issues: (1) the features of an address alone are insufficient to accurately represent its activity characteristics, and (2) the number of disclosed ransomware addresses is far smaller than the number of unlabeled addresses. As a result, a significant number of ransomware addresses go undetected, and the impact of ransomware activities is substantially underestimated.
To solve these two issues, we propose XRAD, an optimized ransomware address detection method based on Bitcoin transaction relationships, which detects more ransomware addresses with high performance. To address the first issue, we present a cascade feature extraction method for Bitcoin transactions that aggregates the features of related addresses after exploring transaction relationships. To address the second issue, we build a classification model based on positive-unlabeled learning. Extensive experiments demonstrate that XRAD significantly improves average accuracy, recall, and F1 score by 15.07%, 19.71%, and 34.83%, respectively, compared to state-of-the-art methods. In total, XRAD detects 120,335 ransomware activities from 2009 to 2023, revealing a development trend and an average ransom payment per year that align with three reports by FinCEN, Chainalysis, and Coveware.
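The idea of aggregating the features of related addresses can be illustrated with a minimal sketch. This is not XRAD's cascade feature extraction, whose details are not given here. It assumes a hypothetical toy transaction list, hypothetical two-value feature vectors, a one-hop neighborhood, and a simple mean as the aggregation function. All names (`transactions`, `own_features`, `build_neighbors`, `aggregate`) are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy transactions: each links input addresses to output addresses.
transactions = [
    {"inputs": ["A"], "outputs": ["B", "C"]},
    {"inputs": ["B"], "outputs": ["D"]},
]

# Hypothetical per-address features, e.g. [total received, transaction count].
own_features = {
    "A": [0.0, 1.0],
    "B": [2.0, 2.0],
    "C": [1.0, 1.0],
    "D": [2.0, 1.0],
}


def build_neighbors(txs):
    """Link every input address of a transaction with every output address."""
    neighbors = defaultdict(set)
    for tx in txs:
        for src in tx["inputs"]:
            for dst in tx["outputs"]:
                if src != dst:
                    neighbors[src].add(dst)
                    neighbors[dst].add(src)
    return neighbors


def aggregate(address, features, neighbors):
    """Concatenate an address's own features with the mean of its neighbors' features."""
    own = features[address]
    related = [features[n] for n in sorted(neighbors[address]) if n in features]
    if not related:
        # No related addresses: pad with zeros so vectors keep a fixed length.
        mean = [0.0] * len(own)
    else:
        mean = [sum(column) / len(related) for column in zip(*related)]
    return own + mean


neighbors = build_neighbors(transactions)
vector_b = aggregate("B", own_features, neighbors)
print(vector_b)
```

In this toy setup, address B is related to A (which paid it) and D (which it paid), so its vector combines its own two features with the average of theirs. A classifier trained on such vectors sees more than the address's isolated behavior, which is the motivation stated for the first issue above.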
Research center:
NCER-FT - FinTech National Centre of Excellence in Research
Disciplines:
Computer science
Author, co-author:
Wang, Kai ; School of Computer Science, Fudan University, Shanghai, China
Tong, Michael ; Software School, Fudan University, Shanghai, China