References of "Data"
     in
Bookmark and Share    
Full Text
Peer Reviewed
See detailUnified, Labeled, and Semi-Structured Database of Pre-Processed Mexican Laws
Martinez-Seis, Bella; Pichardo-Lagunas, Obdulia; Koff, Harlan UL et al

in Data (2022), 7(91),

This paper presents a corpus of pre-processed Mexican laws for computational tasks. The main contributions are the proposed JSON structure and the methodology used to achieve the semi-structured corpus ... [more ▼]

This paper presents a corpus of pre-processed Mexican laws for computational tasks. The main contributions are the proposed JSON structure and the methodology used to achieve the semi-structured corpus with the selected algorithms. Law PDF documents were transformed into plain text, unified by a deconstruction of law–document structure, and labeled with natural language processing techniques considering part of speech (PoS); a process of entity extraction was also performed. The corpus includes the Mexican constitution and the Mexican laws that were collected from the official site in PDF format repealed before 14 October 2021. The collection has 305 documents, including: the Mexican constitution, 289 laws, 8 federal codes, 3 regulations, 2 statutes, 1 decree, and 1 ordinance. The semi-structured database includes the transformation of the set of laws from PDF format to a digital representation in order to facilitate its computational analysis. The documents were migrated to JSON type files to represent internal hierarchical relations. In addition, basic natural language processing techniques were implemented on laws for the identification of part of speech and named entities. The presented data set is mainly useful for text analysis and data science. It could be used for various legislative analysis tasks including: comprehension, interpretation, translation, classification, accessibility, coherence, and searches. Finally, we present some statistic of the identified entities and an example of the usefulness of the corpus for environmental laws. [less ▲]

Detailed reference viewed: 33 (0 UL)
Full Text
Peer Reviewed
See detailLuxembourg Fund Data Repository
Skoura, Angeliki; Presber, Julian; Schiltz, Jang UL

in Data (2020), 5(3), 1-15

In this paper, we introduce the Luxembourg Fund Data Repository, a novel database of investment funds available for academic research that was created at the Department of Finance of the University of ... [more ▼]

In this paper, we introduce the Luxembourg Fund Data Repository, a novel database of investment funds available for academic research that was created at the Department of Finance of the University of Luxembourg. The database contains the population of Undertakings for Collective Investment in Transferable Securities funds domiciled in Luxembourg from the starting month of their existence (March 1988) to October 2016. The fund characteristics are organized in a comprehensive database architecture encompassing static and dynamic data over the entire life of the funds. The characteristics include fund identifiers, official name, status information, management company and other service providers, daily and monthly performance time-series, portfolio holdings, classification of investment objective, fees, dividends, and cash flows. The database was constructed after collecting and assembling complementary historical information from three data providers. Importantly, funds no longer in existence due to liquidation or mergers are included in the database, preventing survivorship bias. The database has been constructed to serve as a research dataset of high accuracy due to the maximization of population coverage, the maximization of historical coverage, and validation by using information acquired from the supervisory authority of the financial sector of Luxembourg. License currently available to researchers of the Department of Finance of the University of Luxembourg. Future plans for extending accessibility to the global academic community. [less ▲]

Detailed reference viewed: 98 (3 UL)