1.7M+ Research Papers of ArXiv are Now Accessible on Kaggle

Aug 07, 2020 | AS Team


arXiv announced in a tweet on August 05, 2020, that its entire collection of research papers is now available on Kaggle, adding the comment, “We can't wait to see what the machine learning community will do with it!” The move makes the world’s largest free repository of scientific papers more accessible.

Launched by Paul Ginsparg in 1991 as a physics archive and now maintained by Cornell University, arXiv has grown into a vital platform that provides free and open access to research papers for machine learning and computer science enthusiasts and beyond.

The new collaboration between arXiv and Kaggle, the world’s largest data science community, provides free and open access to the machine-readable arXiv dataset of around 1.7 million papers. (Source)

arXiv executive director Eleonora Presani said in a press release that offering the dataset on Kaggle goes beyond what people can learn by reading the articles themselves, and that it makes the data and information behind arXiv available to the public in a machine-readable format.

The Kaggle dataset describes each arXiv research paper with the following fields:

  • id: The arXiv ID, which can be used to access the paper.

  • submitter: The person who submitted the paper.

  • authors: Authors of the paper.

  • title: Title of the paper.

  • comments: Extra information like the number of pages and figures.

  • journal-ref: Information about the journal in which the paper was published.

  • doi: Digital Object Identifier (DOI).

  • abstract: The main abstract of the paper.

  • categories: Categories/tags in the arXiv system.

  • versions: The version history of the paper.
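
To make the field list above concrete, here is a minimal Python sketch of how such records might be read and filtered. It rests on several assumptions that the article does not confirm: that the metadata is stored as one JSON object per line, that `categories` is a space-separated string of tags, and that `versions` is a list of small objects. The helper names `iter_records` and `in_category` are illustrative, and the sample record is made up rather than taken from arXiv.

```python
import json
from typing import Iterable, Iterator

# Field names taken from the list above.
FIELDS = ("id", "submitter", "authors", "title", "comments",
          "journal-ref", "doi", "abstract", "categories", "versions")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def in_category(record: dict, category: str) -> bool:
    """Check a tag, assuming `categories` is a space-separated string."""
    return category in (record.get("categories") or "").split()


# Made-up record for illustration only; it is not real arXiv data.
SAMPLE_LINE = json.dumps({
    "id": "0000.00000",
    "submitter": "Example Submitter",
    "authors": "A. Author and B. Author",
    "title": "An Example Title",
    "comments": "10 pages, 2 figures",
    "journal-ref": None,
    "doi": None,
    "abstract": "An example abstract.",
    "categories": "cs.LG stat.ML",
    "versions": [{"version": "v1"}],  # structure is an assumption
})

if __name__ == "__main__":
    for rec in iter_records([SAMPLE_LINE]):
        if in_category(rec, "cs.LG"):
            print(rec["id"], rec["title"])
```

Because `iter_records` accepts any iterable of strings, an open file handle for the downloaded metadata file could be passed in directly, so the full set of records never has to be held in memory at once.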

 

Kaggle is a popular platform for data scientists and machine learning practitioners looking for datasets, public notebooks, competitions, and more. Researchers can use Kaggle's extensive data exploration tools and share their relevant scripts and results with others.
