openAlex new author and work data accuracy. #146

Open
yhan818 opened this issue Aug 14, 2023 · 0 comments
Labels
question Further information is requested

Comments

@yhan818
Contributor

yhan818 commented Aug 14, 2023

Hi all,

The new OpenAlex author data was released on July 25 and officially announced on Aug 11, 2023. I ran some tests after July 25 and again after Aug 11, and noticed some intermediate changes in the author data between those dates.

It seems to me that the disambiguation uses cosine_similarity to measure similarity and XGBoost to match an author with his/her name. See the code: https://github.com/ourresearch/openalex-name-disambiguation/tree/main/V3/002_Data_Processing_Modeling_Clustering
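
To make the similarity idea concrete, here is a minimal sketch of cosine similarity between two author name strings. It is not the OpenAlex code: the character-bigram features, the function names (`char_ngrams`, `cosine_similarity`, `name_similarity`), and the example names are all assumptions chosen for illustration, and only the Python standard library is used.

```python
from collections import Counter
from math import sqrt


def char_ngrams(name: str, n: int = 2) -> Counter:
    """Count character n-grams of a lower-cased, whitespace-normalized name.

    Hypothetical feature choice; the real pipeline may use different features.
    """
    text = " ".join(name.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors.

    Returns 0.0 when either vector is empty, to avoid dividing by zero.
    """
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def name_similarity(name_1: str, name_2: str) -> float:
    """Similarity score in [0, 1] between two name strings."""
    return cosine_similarity(char_ngrams(name_1), char_ngrams(name_2))


if __name__ == "__main__":
    # Made-up names, used only to show the relative ordering of scores.
    for other in ("Jane Smith", "J. Smith", "Maria Lopez"):
        print(other, round(name_similarity("Jane Smith", other), 3))
```

In a setup like the one described above, a score of this kind would presumably be just one input feature to a classifier such as XGBoost, which then decides whether two records belong to the same author. The sketch stops at the similarity step and does not attempt to reproduce that model.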

In general, the new author data is much more accurate than the previous version. However, there are still issues with both author and work records. I have tested some cases; see yhan818/openalexR-test#7

So what are your views on the latest updates?

@trangdata trangdata added the question Further information is requested label Jul 3, 2024