compare biterm topic modelling to rainette, LDA, coclustering, structural topic model, embedding clustering, autoencoders #9
Comments
I have never used (or even looked at) this dataset before, but it may be interesting: https://registry.opendata.aws/amazon-reviews/ |
Interesting and huge dataset, but unfortunately the license of that data is too restrictive. |
You are right. |
I would prefer to use data that can be shared. |
Sorry for not checking before giving the link. |
I have not worked with short texts. Therefore, I have no good sources at hand, unfortunately. Maybe Japanese Haiku to make Text Mining more philosophical ;-)?
Side note: sorry for not having worked on the quality metrics yet; I have too many other non-R-related projects at the moment. I will keep it on my list. For the time being, text2vec::coherence might be used.
On 26 June 2019 at 09:46:52 CEST, jwijffels <[email protected]> wrote:
…Looking for some typical open data with short texts which are
interesting, in order to compare clustering methods (BTM / LDA / stm /
coclustering / reinert text clustering / embedding clustering /
autoencoder)
@datasculptor / @manuelbickel you know any interesting open data?
|
No problem. |
Could this be interesting? https://www.linkedin.com/feed/update/urn:li:activity:6553904839447973888 |
I'm sure you are a fan :) |
Also this one could be interesting: https://github.com/EmilHvitfeldt/textdata |
You could look at manifestos. manifestoR is an R package that provides API access to coded political texts in several languages. |
Interesting. I didn't know these political party manifestos existed. |
Looking for some typical open data with short texts which are interesting, in order to compare clustering methods (BTM / LDA / stm / coclustering / reinert text clustering / embedding clustering / autoencoder)
@datasculptor / @manuelbickel you know any interesting open data?