Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visualize hero.describe with plots #178

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

mk2510
Copy link
Collaborator

@mk2510 mk2510 commented Sep 6, 2020

This new function in the visualization module allows users to easily visualize the describe function. We belive, that this will be often use, to display the key characteristics of the DataFrame. 🥉

This function simply calls the describe functions and then with plotly those plots will be generated and displayed in a new browser window. As it is a lot of code to display them, we belive, that it will be quite an enrichment to the library 🏑

Note: travis/setup changed because of #171 . This is branched straight from PR #168 which is why it has so many lines changed 🥡 .

This is a short colab demo of the new function 🤖

Max stinkt

mk2510 and others added 22 commits August 18, 2020 22:06
suport MultiIndex as function parameter

returns MultiIndex, where Representation was returned

* missing: correct test


Co-authored-by: Henri Froese <[email protected]>
*missing: test adopting for new types


Co-authored-by: Henri Froese <[email protected]>
- add functionality for decorator @InputSeries to handle several allowed input types
- Add typing decorator/hints to representation.py
- add tests for _types DocumentTermDF

Co-authored-by: Maximilian Krahn <[email protected]>
Co-authored-by: Maximilian Krahm <[email protected]>
Co-authored-by: Henri Froese <[email protected]>
Co-authored-by: Henri Froese <[email protected]>
@henrifroese henrifroese changed the title Visualize describe with plots Visualize hero.describe with plots Sep 6, 2020
@jbesomi
Copy link
Owner

jbesomi commented Sep 7, 2020

That's beautiful! 😍

  • What does it happen if you run visualize_describe on an empty DF? (we should test that behavior for all heroes functions ...)
  • (opinionable comment) What if we have all graphs in a single column? That's might be better for visualization on narrow displays

texthero/visualization.py Outdated Show resolved Hide resolved
@henrifroese
Copy link
Collaborator

What does it happen if you run visualize_describe on an empty DF? (we should test that behavior for all heroes functions ...)

Currently we get an error but that's just because of the default arguments (see below). Agree we should test that behavior for all hero functions, probably in a separate PR in the future.

(opinionable comment) What if we have all graphs in a single column? That's might be better for visualization on narrow displays

We tried it out and think it looks good as-is on 13 inch screens (which is the standard "small" laptop size).

From your review

labels_col_name="topic" not sure it make sense to have as default value "topic". Moreover, there might be multiple non-text columns. ...

Yes, the default arguments really make no sense, we just used them for testing purposes and forgot to remove them 🤦 🤦‍♂️ 🤦‍♀️

Additionally, not sure what you mean / how we would integrate several text columns? We thought that there are two main scenarios:

  1. User has text in a column -> he uses our function without the labels stuff, everything's great

  2. User has text in a column and labels in a column (e.g. to train a classifier) -> he uses our function with the labels stuff, everything's great

Additionally just noticed that there isn't a docstring (totally forgot this). Will convert this to draft and address all concerns + add docstring. Will update here and post screenshots when it's done.

- Docstring
- Better signature
@henrifroese
Copy link
Collaborator

henrifroese commented Sep 9, 2020

Now changed the function signature to be in line with hero.describe (which really makes sense as this is a visualization of hero.describe). Additionally, default arguments are changed.

We noticed that this function cannot handle empty input because of hero.tokenize cannot handle empty input. This is probably a low-priority bug as of course empty input does not make any sense whatsoever.

Note: the merge conflicts should be resolved when we're farther along the merge checklist; we'll revisit this

@jbesomi jbesomi marked this pull request as draft September 14, 2020 15:59
@mk2510
Copy link
Collaborator Author

mk2510 commented Sep 22, 2020

we just went through this PR again and it is ready to be reviewed / to be merged 🔢 🧶

@mk2510 mk2510 marked this pull request as ready for review September 22, 2020 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants