Kaggle doesn’t have comprehensive data for this kind of project. Using OpenAlex instead - it’s an open and comprehensive catalog of scholarly papers, authors, institutions, and more

It has a nice API that I think I can call directly from my code, and the documentation is good


since we’re trying to reproduce the previous study → we won’t be able to do it as well, but that’s okay

Boolean query → API call for OpenAlex
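
A minimal sketch of what that step could look like, assuming the Boolean query is passed through the `search` parameter of the OpenAlex `/works` endpoint (which accepts uppercase `AND`, `OR`, `NOT`). The query string in the comment at the bottom is a made-up placeholder, not the one from the previous study, and the helper names `build_works_url` and `fetch_works` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def build_works_url(boolean_query, per_page=25, mailto=None):
    """Turn a Boolean query string into an OpenAlex /works request URL."""
    params = {"search": boolean_query, "per-page": per_page}
    if mailto:
        # Optional contact email; OpenAlex asks for one for its "polite pool".
        params["mailto"] = mailto
    return f"{OPENALEX_WORKS_URL}?{urlencode(params)}"


def fetch_works(boolean_query, per_page=25, mailto=None):
    """Call the API and return the parsed JSON response as a dict."""
    url = build_works_url(boolean_query, per_page=per_page, mailto=mailto)
    with urlopen(url, timeout=30) as response:
        return json.load(response)


# Example usage (placeholder query, needs network access):
#   data = fetch_works('("citizen science" OR "crowdsourcing") AND biodiversity')
#   for work in data["results"]:
#       print(work["publication_year"], work["display_name"])
```

Keeping the URL building separate from the network call makes it easy to print and check the exact request before sending it.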

result: the data comes back in a convenient format (JSON), yay

extra: we can explore the data by plotting things like the number of publications from our search in each year, or the number of publications with a specific commonly used keyword. Numeric data is the easiest to plot, so the publication_year field would work best here
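
A rough sketch of the per-year plot, assuming the search results are a list of work records (dicts) that each carry a `publication_year` field. The function names `count_by_year` and `plot_counts` are hypothetical, and matplotlib is an assumed choice of plotting library, imported only when the plot is actually drawn.

```python
from collections import Counter


def count_by_year(works):
    """Count publications per year, skipping records with no year."""
    years = [
        work["publication_year"]
        for work in works
        if work.get("publication_year") is not None
    ]
    # Sort by year so the bars appear in chronological order.
    return dict(sorted(Counter(years).items()))


def plot_counts(counts, title="Publications per year"):
    """Draw a bar chart of the yearly counts (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_xlabel("publication_year")
    ax.set_ylabel("number of publications")
    ax.set_title(title)
    plt.show()


# Example usage with records already fetched from the API:
#   counts = count_by_year(data["results"])
#   plot_counts(counts)
```

The same counting idea could be reused for the keyword plot by filtering the records first and then counting what is left.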