Collecting, transforming and cleaning JSTOR metadata in Python

By Medium - 2021-03-16

Description

The JSTOR database is one of the leading sources of research articles in more than 50 disciplines of science. In the Data for Research section, researchers can access datasets for use in research and…

Summary

A simple guide to parsing metadata from the JSTOR Data for Research database using the ElementTree XML API.
The JSTOR database is one of the leading sources of research articles in more than 50 disciplines of science.
To make it easier for data scientists and researchers to access larger volumes of data, in this article I show the Python code for parsing the XML outputs, explain the process of collecting the data from the JSTOR Data for Research database, and show an application of this type of data.
The code follows the hierarchical structure of the XML files and ignores book reviews and notices.
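
The parsing step described above can be illustrated with a minimal sketch. It is not the article's code: the sample record, the JATS-like tag names (`journal-title`, `article-title`, `contrib`, `string-name`, `pub-date`), the `article-type` values used to skip reviews and notices, and the helper name `parse_record` are all assumptions made for illustration. Only the `xml.etree.ElementTree` calls are standard library API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record. The tag names follow a JATS-like layout and are
# assumptions for illustration, not taken from the article.
SAMPLE_XML = """<article article-type="research-article">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Example Journal</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An Example Title</article-title>
      </title-group>
      <contrib-group>
        <contrib>
          <string-name>
            <given-names>Ada</given-names>
            <surname>Lovelace</surname>
          </string-name>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>1999</year>
      </pub-date>
    </article-meta>
  </front>
</article>"""

# Assumed article-type labels for book reviews and notices.
SKIPPED_TYPES = {"book-review", "misc"}


def parse_record(xml_text):
    """Return a dict of metadata for one record, or None if it is skipped."""
    root = ET.fromstring(xml_text)
    if root.get("article-type") in SKIPPED_TYPES:
        return None

    # Walk down the hierarchy: contrib -> string-name -> given-names / surname.
    authors = []
    for name in root.iterfind(".//contrib/string-name"):
        given = name.findtext("given-names", default="")
        surname = name.findtext("surname", default="")
        authors.append(f"{given} {surname}".strip())

    return {
        "journal": root.findtext(".//journal-title", default=""),
        "title": root.findtext(".//article-title", default=""),
        "authors": authors,
        "year": root.findtext(".//pub-date/year", default=""),
    }


record = parse_record(SAMPLE_XML)
skipped = parse_record('<article article-type="book-review"><front/></article>')
```

Returning `None` for skipped records keeps the filtering decision in one place, so a caller looping over many files can simply drop the `None` values before building a table.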

Topics

Backend (0.4)
Database (0.23)
Machine_Learning (0.16)

Similar Articles

Drowning in Data? How To Ensure Your Data Strategy Isn't Hurting Your Brand?

By CMSWire.com - 2021-03-16

Not all data is valuable or actionable and discerning which is which can be hard. Learn to craft a successful data strategy that can help a brand learn to swim.

Big data architecture style - Azure Application Architecture Guide

By Docs - 2021-01-24

Describes benefits, challenges, and best practices for Big Data architectures on Azure.

The Growing Importance of Metadata Management Systems

By Gradient Flow - 2021-02-02

Metadata will be the foundation for data governance solutions, data catalogs, and other enterprise data systems. By Assaf Araki and Ben Lorica. Introduction As companies embrace digital technologie…

Data Science Minimum: 10 Essential Skills You Need to Know to Start Doing Data Science

By KDnuggets - 2020-10-13

Data science is ever-evolving, so mastering its foundational technical and soft skills will help you be successful in a career as a Data Scientist, as well as pursue advanced concepts, such as deep lea ...

DS 101: Machine Learning & Modelling in Theory & Practice

By Ai+ Training - 2021-01-26

Join AI+ Subscription to learn at ODSC Training about Machine Learning & Modelling

Learning Data Science From the Perspective of a Proficient Developer

By Medium - 2020-12-08

As you know, data science, and more specifically machine learning, is very much en vogue now, so guess what? I decided to enroll in a MOOC to become fluent in data science. But when you start with a…

Feedback

Let us know what you think about this newsletter, or whether you would like to add new topics or keywords.

contact@velasticity.com

Bookmarks

Latest Readings in NLP

By Medium - 2021-03-16

Stunning Tables using bokeh and svg

By KDnuggets - 2021-03-16

Natural Language Processing Pipelines, Explained

By KDnuggets - 2021-03-16

2019 Best Masters in Data Science and Analytics – Online

By Medium - 2021-03-13

Storage & Compute for Machine Learning

By congress - 2021-03-15

Text - H.R.1019 - 117th Congress (2021-2022): E-BIKE Act

By KDnuggets - 2021-03-16

Top 10 Best Podcasts on AI, Analytics, Data Science, Machine Learning

By datasciencecentral - 2021-03-16

How to become a Digital Strategy Leader

By datasciencecentral - 2021-03-16

All about Use Of Data Science

By Medium - 2021-02-15

10 Hyper-parameter Tuning Libraries

By GitHub - 2021-03-16

doc.noun_chunks is not supported for Chinese language, how to figure this out? · Issue #7436 · explosion/spaCy

By Medium - 2021-03-15

Gaussian Process Regression From First Principles

By datasciencecentral - 2021-03-16

5 tasks You Can Automate in Business Intelligence (BI) and Analytics

By GitHub - 2021-03-15

Avoiding accidental errors with sanity checks · Discussion #5053 · allenai/allennlp

By KDnuggets - 2021-03-14

Emotion and Sentiment Analysis: A Practitioner’s Guide to NLP

By Medium - 2021-03-16

How Data Science Can Give Further Understanding on Urban Poverty

By Electronic Frontier Foundation - 2021-03-03

Google’s FLoC Is a Terrible Idea

By Medium - 2021-03-13

Spark it up a notch. Nitty-gritty details of Apache Spark

By datasciencecentral - 2021-03-16

7 Key Benefits of Integrating Asset Monitoring in the Water Sector

By Medium - 2021-03-16

Why Machines Will Never Feel Empathy: A Q&A With MIT’s Sherry Turkle

By datasciencecentral - 2021-03-16

Clustering with Scikit with GIFs

By Coursera - 2021-03-14

Numerical Methods for Engineers

By Medium - 2021-03-16

Introduction to Bootstrapping in Data Science — part

By SearchEnterpriseAI - 2021-03-15

The power and limitations of enterprise AI

By GitHub - 2021-03-14

aajanki/spacy-fi

By Selbstmanagement - 2021-03-14

By KDnuggets - 2021-03-14

Feature Store as a Foundation for Machine Learning

By Medium - 2021-03-11

Lowri Williams on How to Connect Your Academic Training to Real-World Challenges

By huggingface - 2021-03-15

elgeish/wav2vec2-large-xlsr-53-arabic · Hugging Face

By Medium - 2021-03-16

The All-time Best Guides to Data Science Writing

By datasciencecentral - 2021-03-16

Can Gerrymandering Be Ended via Machine Learning?

By KDnuggets - 2021-03-14

Naïve Bayes Algorithm: Everything you need to know

By Medium - 2021-03-09

Weekly Awesome Tricks And Best Practices From Kaggle | Towards Dev

By huggingface - 2021-03-16

Hugging Face – On a mission to solve NLP, one commit at a time.

By datasciencecentral - 2021-03-16

Google is Rethinking its Business – What About You?

By datasciencecentral - 2021-03-16

Media and Entertainment: How This Industry is Impacted by Big Data

By KDnuggets - 2021-03-14

Introduction to Data Engineering

By KDnuggets - 2021-03-16

Metric Matters, Part 1: Evaluating Classification Models

By datasciencecentral - 2021-03-16

Markov Chain Monte Carlo Methods for Bayesian Data Analysis in Astronomy

By datasciencecentral - 2021-03-15

Data Analytics Perks