Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
Melissa M. Terras. The rise of digitization. Digitisation Perspectives, pages 3-20, 2011.
Stanley A. South. Method and Theory in Historical Archeology. Academic Press, 1977.
Hans Peter Luhn. A statistical approach to mechanized encoding and searching of literary information. IBM Journal of Research and Development, 1(4):309-317, 1957.
Vishal Gupta and Gurpreet Singh Lehal. A survey of text summarization extractive techniques. Journal of Emerging Technologies in Web Intelligence, 2(3):258-268, 2010.
Rafael Ferreira, Luciano de Souza Cabral, Rafael Dueire Lins, Gabriel Pereira Silva, Fred Freitas, George D. C. Cavalcanti, Rinaldo Lima, Steven J. Simske, and Luciano Favaro. Assessing sentence scoring techniques for extractive text summarization. Expert Systems with Applications, 40(14):5755-5764, 2013.
Salima Lamsiyah, Abdelkader El Mahdaouy, Said El Alaoui Ouatik, and Bernard Espinasse. Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning. Journal of Information Science, 49(1):164-182, 2023.
Ramesh Nallapati, Bowen Zhou, Cicero dos Santos, Çağlar Gülçehre, and Bing Xiang. Abstractive text summarization using sequence-to-sequence RNNs and beyond. In Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, pages 280-290. Association for Computational Linguistics, 2016.
Hui Lin and Vincent Ng. Abstractive summarization: A survey of the state of the art. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 9815-9822, 2019.
Shuaiqi Liu, Jiannong Cao, Ruosong Yang, and Zhiyuan Wen. Key phrase aware transformer for abstractive summarization. Information Processing & Management, 59(3):102913, 2022.
Yihong Gong and Xin Liu. Generic text summarization using relevance measure and latent semantic analysis. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 19-25, 2001.
Seeger Fisher and Brian Roark. Query-focused summarization by supervised sentence ranking and skewed word distributions. In Proceedings of the Document Understanding Conference, DUC-2006, New York, USA, 2006.
Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, and Nazli Goharian. A discourse-aware attention model for abstractive summarization of long documents. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 615-621. Association for Computational Linguistics, 2018.
Yang Liu and Mirella Lapata. Text summarization with pretrained encoders. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3730-3740. Association for Computational Linguistics, 2019.
Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871-7880. Association for Computational Linguistics, 2020.
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1):5485-5551, 2020.
Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter Liu. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning, pages 11328-11339. PMLR, 2020.
Wenjun Qiu and Yang Xu. HistBERT: A pre-trained language model for diachronic lexical semantic analysis. arXiv preprint arXiv:2202.03612, 2022.
Zara Nasar, Syed Waqar Jaffry, and Muhammad Kamran Malik. Textual keyword extraction and summarization: State-of-the-art. Information Processing & Management, 56(6):102088, 2019.
Wafaa S. El-Kassas, Cherif R. Salama, Ahmed A. Rafea, and Hoda K. Mohamed. Automatic text summarization: A comprehensive survey. Expert Systems with Applications, 165:113679, 2021.
Michael Piotrowski. Natural language processing for historical texts. Synthesis Lectures on Human Language Technologies, 5(2):1-157, 2012.
Maud Ehrmann, Ahmed Hamdi, Elvys Linhares Pontes, Matteo Romanello, and Antoine Doucet. Named entity recognition and classification on historical documents: A survey. arXiv preprint arXiv:2109.11406, 2021.
Thomas Hills and Alessandro Miani. A short primer on historical natural language processing.
Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, and Tatsunori B. Hashimoto. Benchmarking large language models for news summarization. Transactions of the Association for Computational Linguistics, 12:39-57, 2024.
Hanlei Jin, Yang Zhang, Dan Meng, Jun Wang, and Jinghua Tan. A comprehensive survey on process-oriented automatic text summarization with exploration of LLM-based methods. arXiv preprint arXiv:2403.02901, 2024.
Jennifer Holt and Alisa Perren. Media industries: History, theory, and method. John Wiley & Sons, 2011.
Stephan Oepen, Kristin Hagen, and Janne Bondi Johannessen, editors. Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), Oslo, Norway, May 2013. Linköping University Electronic Press, Sweden.
Gerlof Bouma and Yvonne Adesam, editors. Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language, Gothenburg, May 2017. Linköping University Electronic Press.
Taraka Rama. Studies in computational historical linguistics: Models and analyses. PhD thesis, University of Gothenburg, 2015.
Eva Pettersson, Jonas Lindström, Benny Jacobsson, and Rosemarie Fiebranz. HistSearch: Implementation and evaluation of a web-based tool for automatic information extraction from historical text. In HistoInformatics@DH, pages 25-36, 2016.
Marcel Bollmann, Anders Søgaard, and Joachim Bingel. Multi-task learning for historical text normalization: Size matters. In Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP, pages 19-24, 2018.
Erik Tjong Kim Sang, Marcel Bollmann, Remko Boschker, Francisco Casacuberta, Feike Dietz, Stefanie Dipper, Miguel Domingo, Rob van der Goot, Marjo van Koppen, Nikola Ljubešić, et al. The CLIN27 shared task: Translating historical text to contemporary language for improving automatic linguistic annotation. Computational Linguistics in the Netherlands Journal, 7:53-64, 2017.
Yi Yang and Jacob Eisenstein. Part-of-speech tagging for historical English. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1318-1328. Association for Computational Linguistics, 2016.
Baptiste Blouin, Benoit Favre, Jeremy Auguste, and Christian Henriot. Transferring modern named entity recognition to the historical domain: How to take the step? In Proceedings of the Workshop on Natural Language Processing for Digital Humanities, pages 152-162. NLP Association of India (NLPAI), 2021.
William L. Hamilton, Kevin Clark, Jure Leskovec, and Dan Jurafsky. Inducing domain-specific sentiment lexicons from unlabeled corpora. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 595-605, Austin, Texas, November 2016. Association for Computational Linguistics.
Rachele Sprugnoli and Sara Tonelli. Novel event detection and classification for historical texts. Computational Linguistics, 45(2):229-265, 2019.
Viet Dac Lai, Minh Van Nguyen, Heidi Kaufman, and Thien Huu Nguyen. Event extraction from historical texts: A new dataset for black rebellions. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2390-2400, 2021.
Shmuel Liebeskind and Chaya Liebeskind. Deep learning for period classification of historical Hebrew texts. Journal of Data Mining & Digital Humanities, 2020, 2020.
James Gung and Jugal Kalita. Summarization of historical articles using temporal event clustering. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 631-635, 2012.
Partha Protim Ghosh, Rezvi Shahariar, and Muhammad Asif Hossain Khan. A rule based extractive text summarization technique for Bangla news documents. International Journal of Modern Education and Computer Science, 10(12):44, 2018.
Chin-Yew Lin. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74-81. Association for Computational Linguistics, July 2004.
Xutan Peng, Yi Zheng, Chenghua Lin, and Advaith Siddharthan. Summarising historical text in modern languages. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 3123-3142, Online, April 2021. Association for Computational Linguistics.
Yongping Du, Qingxiao Li, Lulin Wang, and Yanqing He. Biomedical-domain pre-trained language model for extractive summarization. Knowledge-Based Systems, 199:105964, 2020.
Alexander M. Rush, Sumit Chopra, and Jason Weston. A neural attention model for abstractive sentence summarization. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 379-389. Association for Computational Linguistics, 2015.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171-4186. Association for Computational Linguistics, 2019.
Nils Reimers and Iryna Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982-3992, 2019.
Liu Zhuang, Lin Wayne, Shi Ya, and Zhao Jun. A robustly optimized BERT pre-training approach with post-training. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, pages 1218-1227, 2021.
Abhay Shukla, Paheli Bhattacharya, Soham Poddar, Rajdeep Mukherjee, Kripabandhu Ghosh, Pawan Goyal, and Saptarshi Ghosh. Legal case document summarization: Extractive and abstractive methods and their evaluation. In Yulan He, Heng Ji, Sujian Li, Yang Liu, and Chia-Hui Chang, editors, Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1048-1064, Online only, November 2022. Association for Computational Linguistics.
Diego de Vargas Feijo and Viviane P. Moreira. Improving abstractive summarization of legal rulings through textual entailment. Artificial Intelligence and Law, 31(1):91-113, November 2021.
Iz Beltagy, Matthew E. Peters, and Arman Cohan. Longformer: The long-document transformer. arXiv preprint arXiv:2004.05150, 2020.
Alex Aguiar Lins, Cecilia Silvestre Carvalho, Francisco Das Chagas Jucá Bomfim, Daniel de Carvalho Bentes, and Vládia Pinheiro. CLSJUR.BR: A model for abstractive summarization of legal documents in Portuguese language based on contrastive learning. In Pablo Gamallo, Daniela Claro, António Teixeira, Livy Real, Marcos Garcia, Hugo Gonçalo Oliveira, and Raquel Amaro, editors, Proceedings of the 16th International Conference on Computational Processing of Portuguese (Volume 1), pages 321-331, Santiago de Compostela, Galicia/Spain, March 2024. Association for Computational Linguistics.
Lochan Basyal and Mihir Sanghvi. Text summarization using large language models: A comparative study of MPT-7B-Instruct, Falcon-7B-Instruct, and OpenAI ChatGPT models. arXiv preprint arXiv:2310.10449, 2023.
Hadi Askari, Anshuman Chhabra, Muhao Chen, and Prasant Mohapatra. Assessing LLMs for zero-shot abstractive summarization through the lens of relevance paraphrasing. arXiv preprint arXiv:2406.03993, 2024.
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55-60, 2014.
Ming Zhong, Pengfei Liu, Danqing Wang, Xipeng Qiu, and Xuanjing Huang. Searching for effective neural extractive summarization: What works and what's next. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1049-1058, 2019.
Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. In International Conference on Learning Representations, 2018.
Maarten R. Grootendorst. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794, 2022.
Daniel Deutsch, Rotem Dror, and Dan Roth. A statistical analysis of summarization evaluation metrics using resampling methods. Transactions of the Association for Computational Linguistics, 9:1132-1146, 2021.
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
Shashi Narayan, Shay B. Cohen, and Mirella Lapata. Don't give me the details, just the summary! topic-aware convolutional neural networks for extreme summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1797-1807, 2018.
Imran Chamieh, Torsten Zesch, and Klaus Giebermann. LLMs in short answer scoring: Limitations and promise of zero-shot and few-shot approaches. In Ekaterina Kochmar, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, and Zheng Yuan, editors, Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), pages 309-315, Mexico City, Mexico, June 2024. Association for Computational Linguistics.
Agam Shah and Sudheer Chava. Zero is not hero yet: Benchmarking zero-shot performance of LLMs for financial tasks. arXiv preprint arXiv:2305.16633, 2023.