20News. 2008. 20News. http://qwone.com/~jason/20Newsgroups/. Online; accessed January 2021.
Ashutosh Adhikari, Achyudh Ram, Raphael Tang, and Jimmy Lin. 2019. DocBERT: BERT for Document Classification. CoRR abs/1904.08398 (2019).
Charu C Aggarwal and ChengXiang Zhai. 2012. A survey of text classification algorithms. In Mining text data. Springer, 163-222.
Emily Alsentzer, John Murphy, William Boag, Wei-Hung Weng, Di Jindi, Tristan Naumann, and Matthew McDermott. 2019. Publicly Available Clinical BERT Embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop. Association for Computational Linguistics, Minneapolis, Minnesota, USA, 72-78. https://doi.org/10.18653/v1/W19-1909
Chaitanya Anne, Avdesh Mishra, Tamjidul Hoque, and Shengru Tu. 2018. Multiclass patent document classification. Artif. Intell. Research 7 (2018), 1-14.
Dogu Araci. 2019. FinBERT: Financial Sentiment Analysis with Pre-Trained Language Models. arXiv preprint arXiv:1908.10063 (2019).
Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A pretrained language model for scientific text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 3606-3611.
Alexis Conneau and Guillaume Lample. 2019. Cross-lingual Language Model Pretraining. In Advances in Neural Information Processing Systems. 7057-7067.
Matthias Damaschk, Tillmann Dönicke, and Florian Lux. 2019. Multiclass Text Classification on Unbalanced, Sparse and Noisy Data. In Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing. Linköping University Electronic Press, Turku, Finland, 58-65. https://www.aclweb.org/anthology/W19-6207
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1. 4171-4186.
Emad Elwany, Dave Moore, and Gaurav Oberoi. 2019. BERT Goes to Law School: Quantifying the Competitive Advantage of Access to Large Legal Corpora in Contract Understanding. CoRR abs/1911.00473 (2019). arXiv:1911.00473 http://arxiv.org/abs/1911.00473
Eibe Frank and Remco R. Bouckaert. 2006. Naive Bayes for Text Classification with Unbalanced Classes. In Knowledge Discovery in Databases: PKDD 2006, Johannes Fürnkranz, Tobias Scheffer, and Myra Spiliopoulou (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 503-510.
Derek Greene and Pádraig Cunningham. 2006. Practical solutions to the problem of diagonal dominance in kernel document clustering. In Proceedings of the 23rd international conference on Machine learning. 377-384.
Kexin Huang, Jaan Altosaar, and Rajesh Ranganath. 2019. ClinicalBERT: Modeling clinical notes and predicting hospital readmission. arXiv preprint arXiv:1904.05342 (2019).
Alon Jacovi, Oren Sar Shalom, and Yoav Goldberg. 2018. Understanding Convolutional Neural Networks for Text Classification. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, Belgium, 56-65. https://doi.org/10.18653/v1/W18-5408
Sang-Bum Kim, Kyoung-Soo Han, Hae-Chang Rim, and Sung-Hyon Myaeng. 2006. Some Effective Techniques for Naive Bayes Text Classification. Knowledge and Data Engineering, IEEE Transactions on 18 (12 2006), 1457-1466. https://doi.org/10.1109/TKDE.2006.180
Kamran Kowsari, Kiana Jafari Meimandi, Mojtaba Heidarysafa, Sanjana Mendu, Laura Barnes, and Donald Brown. 2019. Text classification algorithms: A survey. Information 10, 4 (2019), 150.
Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Recurrent convolutional neural networks for text classification. In Twenty-ninth AAAI conference on artificial intelligence.
Jieh-Sheng Lee and Jieh Hsiang. 2019. PatentBERT: Patent Classification with Fine-Tuning a Pre-Trained BERT Model. CoRR abs/1906.02124 (2019). arXiv:1906.02124 http://arxiv.org/abs/1906.02124
Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2020. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 4 (2020), 1234-1240.
Baoli Li and Carl Vogel. 2010. Improving Multiclass Text Classification with Error-Correcting Output Coding and Sub-class Partitions. In Advances in Artificial Intelligence, Atefeh Farzindar and Vlado Kešelj (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 4-15.
Fei Li, Yonghao Jin, Weisong Liu, Bhanu Pratap Singh Rawat, Pengshan Cai, and Hong Yu. 2019. Fine-Tuning Bidirectional Encoder Representations From Transformers (BERT)-Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study. JMIR medical informatics 7, 3 (2019), e14830.
Clavance Lim. 2019. An Evaluation of Machine Learning Approaches to Natural Language Processing for Legal Text Classification. Master's thesis. Imperial College London.
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692 (2019).
Ronny Luss and Alexandre d'Aspremont. 2015. Predicting abnormal returns from news using text classification. Quantitative Finance 15, 6 (2015), 999-1012.
Pekka Malo, Ankur Sinha, Pekka Korhonen, Jyrki Wallenius, and Pyry Takala. 2014. Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology 65, 4 (2014), 782-796.
Andrew McCallum, Kamal Nigam, et al. 1998. A comparison of event models for naive bayes text classification. In AAAI-98 workshop on learning for text categorization, Vol. 752. Citeseer, 41-48.
Tom McCoy, Ellie Pavlick, and Tal Linzen. 2019. Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 3428-3448. https://doi.org/10.18653/v1/P19-1334
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
Benjamin Muller, Benoit Sagot, and Djamé Seddah. 2019. Enhancing BERT for Lexical Normalization. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). Association for Computational Linguistics, Hong Kong, China, 297-306. https://doi.org/10.18653/v1/D19-5539
Timothy Niven and Hung-Yu Kao. 2019. Probing Neural Network Comprehension of Natural Language Arguments. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 4658-4664. https://doi.org/10.18653/v1/P19-1459
Yifan Peng, Qingyu Chen, and Zhiyong Lu. 2020. An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining. In Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing. Association for Computational Linguistics, Online, 205-214. https://doi.org/10.18653/v1/2020.bionlp-1.22
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In Empirical Methods in Natural Language Processing (EMNLP). 1532-1543. http://www.aclweb.org/anthology/D14-1162
Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding with unsupervised learning. Technical report, OpenAI (2018).
Jason DM Rennie and Ryan Rifkin. 2001. Improving multiclass text classification with the support vector machine. Technical report, AIM-2001-026.2001 (2001).
Jason D. M. Rennie. 1999. Improving Multi-class Text Classification with Naive Bayes. Master's thesis. Massachusetts Institute of Technology. http://qwone.com/~jason/papers/sm-Thesis.pdf
Reuters-TRC2. 2004. Thomson Reuters Text Research Collection. https://trec.nist.gov/data/reuters/reuters.html. Online; accessed January 2021.
Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. In Proceedings of the 5th Workshop on Energy Efficient Machine Learning and Cognitive Computing (EMC2) co-located with the Thirty-Third Conference on Neural Information Processing Systems (NeurIPS 2019). 1-5.
Timo Schick and Hinrich Schütze. 2020. Rare Words: A Major Problem for Contextualized Embeddings and How to Fix it by Attentive Mimicking. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 8766-8774. https://aaai.org/ojs/index.php/AAAI/article/view/6403
Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and Policy Considerations for Deep Learning in NLP. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28-August 2, 2019, Volume 1: Long Papers, Anna Korhonen, David R. Traum, and Lluís Màrquez (Eds.). Association for Computational Linguistics, 3645-3650. https://doi.org/10.18653/v1/p19-1355
Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. 2019. How to fine-tune BERT for text classification?. In China National Conference on Chinese Computational Linguistics. Springer International Publishing, 194-206.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. 5998-6008.
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. GLUE: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of ICLR.
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew. 2019. HuggingFace's Transformers: State-of-the-Art Natural Language Processing. ArXiv abs/1910.03771 (2019).
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016).
Baoxun Xu, Xiufeng Guo, Yunming Ye, and Jiefeng Cheng. 2012. An Improved Random Forest Classifier for Text Categorization. JCP 7 (2012), 2913-2920.
Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in neural information processing systems. 5754-5764.
Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchical attention networks for document classification. In Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies. 1480-1489.
Chin Man Yeung. 2019. Effects of inserting domain vocabulary and fine-tuning BERT for German legal language. Master's thesis. University of Twente. http://essay.utwente.nl/80128/
Shanshan Yu, Jindian Su, and Da Luo. 2019. Improving BERT-based text classification with auxiliary sentence and domain knowledge. IEEE Access 7 (2019), 176600-176612.
Matt Zames. 2016. 2016 Letter To JP Morgan Shareholders, 2016 Annual Report. https://www.jpmorganchase.com/corporate/investor-relations/document/ar2016-lettertoshareholders.pdf.
H. Zhang and D. Li. 2007. Naïve Bayes Text Classifier. In 2007 IEEE International Conference on Granular Computing (GRC 2007). 708-708.