The clusters are characterized by the predominance of the following terms in their abstracts

Keywords: Document clustering, Scientometrics.

Observing trends in academic journals often relies on a priori knowledge of subfields and topics found in publications.

This visualization was built with D3 on datasets extracted from PubMed. It illustrates how the number of publications (y-axis) on various topics (color) in specific academic journals fluctuates over the years. Trends are colored by their respective topic cluster.
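As a rough sketch of the kind of data such a chart needs, the snippet below aggregates articles into one yearly count series per topic cluster. The records, the cluster names, and the `yearly_counts` helper are all hypothetical; the actual data format used here has not been published.

```python
from collections import Counter, defaultdict

# Hypothetical records: one (publication year, cluster label) pair per article.
records = [
    (2010, "imaging"), (2010, "genetics"), (2011, "imaging"),
    (2011, "imaging"), (2012, "genetics"),
]

def yearly_counts(records):
    """Return {cluster: [{"year": y, "count": n}, ...]} covering every year seen."""
    counts = defaultdict(Counter)
    for year, cluster in records:
        counts[cluster][year] += 1
    years = sorted({year for year, _ in records})
    # Years with no article in a cluster are filled with 0 so that
    # every series has the same length along the x-axis.
    return {
        cluster: [{"year": y, "count": counts[cluster][y]} for y in years]
        for cluster in sorted(counts)
    }

series = yearly_counts(records)
```

A structure like `series` could then be serialized to JSON and handed to D3, with one colored trend per key.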

The topics were extracted by clustering the documents as a single corpus. A surprising number of steps are needed to extract these trends, because no a priori information on the content of these journals is assumed. Much of the development thus rests on various forms of Natural Language Processing.
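To give a feel for what clustering a corpus without prior topic labels can look like, here is a minimal, standard-library-only sketch: abstracts are turned into TF-IDF vectors, merged by average-linkage agglomerative clustering on cosine similarity, and each cluster is summarized by its highest-weighted terms. This is an illustration under stated assumptions, not the pipeline used here: the toy abstracts, the tiny stopword list, the IDF variant, and all function names are hypothetical. In practice, NLTK would typically handle tokenization and stopwords, and SciPy's `scipy.cluster.hierarchy` module the clustering.

```python
import math
import re
from collections import Counter

STOPWORDS = {"in", "and", "with", "of", "the"}  # tiny illustrative list

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def tfidf_vectors(docs):
    """Return one L2-normalized {term: weight} dict per document."""
    token_lists = [tokenize(d) for d in docs]
    df = Counter(t for tokens in token_lists for t in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # One common IDF variant: log(N / df) + 1.
        vec = {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def agglomerate(vectors, k):
    """Merge the two most similar clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                sim = sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters

def top_terms(vectors, members, n=3):
    """Terms with the largest summed TF-IDF weight across a cluster."""
    totals = Counter()
    for i in members:
        totals.update(vectors[i])
    return [t for t, _ in totals.most_common(n)]

# Hypothetical toy abstracts, for illustration only.
docs = [
    "Gene expression in tumor cells",
    "Tumor gene mutation and expression",
    "Brain imaging with functional MRI",
    "MRI imaging of the brain cortex",
]
vectors = tfidf_vectors(docs)
clusters = agglomerate(vectors, k=2)
labels = {tuple(sorted(c)): top_terms(vectors, c) for c in clusters}
```

On this toy input the two genetics abstracts and the two imaging abstracts end up in separate clusters, each labeled by the terms that dominate its abstracts. The number of clusters `k` is fixed by hand here; choosing it from the data is one of the extra steps a real pipeline needs.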

Interested in seeing how these data were extracted and analyzed? The code will soon be released on GitHub! Want to collaborate? Send an email my way. Think this was fun to play with? Head over to another one of my projects!

Tools & Libraries: D3, Bootstrap Slider, SciPy, NLTK,