WIP: Add a .shiny submodule (do not merge) #77

Draft
wants to merge 1 commit into
base: main
from

Conversation

phobson
Contributor

@phobson phobson commented Mar 5, 2025

Summary

Here's a basic start on adding the .shiny submodule. Locally, I've created a little app similar to (but better than) the one I posted in Discord.

There's still a lot to do (namely tests and docs). The basic premise of the approach is this:

  • Provide shiny "modules" so that users can easily plug pointblank's GT output into a shiny app
  • User inputs to the server module include:
    • A reactive value/calc (callable) that returns a pb.Validate object. Pointblank will call .interrogate() on it
    • An optional reactive value/calc (callable) that returns a GT instance to preview the data the way the user might want. If this isn't provided, we build a simple one ourselves:
@module.server
def pb_server(
   user_in: Inputs,
   output: Outputs,
   session: Session,
   validator: Callable[[], Validate],
   display_table: Callable[[], GT] | None = None,
):
   ...

   @render_gt
   def data_table() -> GT:
       if display_table is not None:
           return display_table()
       else:
           return (
               GT(validation().data.head(10))
               .tab_style(style.text(weight="bold"), loc.column_header())
               .opt_stylize(style=1)
           )
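
Stripped of Shiny and great_tables, the fallback in data_table comes down to the pattern below. This is only an illustrative sketch: resolve_preview and the string-based preview are hypothetical stand-ins, not part of pointblank, and the real module would return a GT object rather than text.

```python
from typing import Callable, Optional


def resolve_preview(
    rows: list[dict],
    display_table: Optional[Callable[[], str]] = None,
    n: int = 10,
) -> str:
    """Return the caller's preview if one was supplied, else a default one.

    Hypothetical helper: it mirrors the `display_table is not None` branch
    in the snippet above, using plain strings in place of GT objects.
    """
    if display_table is not None:
        # Call lazily so the caller's function only runs when a preview is needed.
        return display_table()
    # Default preview: the first n rows, one per line.
    return "\n".join(str(row) for row in rows[:n])


rows = [{"hp": 600 + i} for i in range(25)]
default_preview = resolve_preview(rows)
custom_preview = resolve_preview(rows, display_table=lambda: f"{len(rows)} rows")
```

Passing a callable rather than a finished table is what would let a reactive calc be handed in directly, so the preview is recomputed whenever its inputs change.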

I've included a brief app that demonstrates usage.

Questions

  1. I've never worked with Quarto and haven't yet tried to build the docs for pointblank. Will a Quarto doc with server: shiny work with the doc builds? More to the point, how would you like this documented?
  2. For the example, do you have a preference for working with pandas over polars? Narwhals? I know the narwhals/great_tables/pointblank ecosystem is actively evolving. Looking ahead, what would you like to see here?
  3. I'm sure I'll have more.

Related GitHub Issues and PRs

None. Working based on this:
https://discord.com/channels/1345877328982446110/1346204992226197627/1346257619396067400

Checklist

  • I understand and agree to the Code of Conduct.
  • I have followed the Style Guide for Python Code as best as possible for the submitted code.
  • I have added pytest unit tests for any new functionality.
  • I've written a .qmd example showcasing the module's usage

Example App

Click to see app source
from great_tables import GT, data, loc, style
from shiny import App, reactive, ui

import pointblank as pb
from pointblank.shiny import pb_server, pb_ui

CARS_ID = "cars_validation"


app_ui = ui.page_fluid(
    ui.input_slider("num_rows", "Rows in Preview (GT Cars only)", 5, 50, 10, step=5),
    ui.navset_tab(
        ui.nav_panel(
            ui.h4("GT Cars Data"),
            ui.input_select("select_drivetrain", "Select Drivetrain Type", ["rwd", "awd"]),
            pb_ui(CARS_ID, title="GT Cars Data Validation"),
        ),
    ),
)


def server(user_in, output, session):
    @reactive.calc
    def car_data():
        drivetrain = user_in.select_drivetrain()
        return data.gtcars.assign(hp=lambda df: df["hp"]).loc[
            lambda df: df["drivetrain"] == drivetrain
        ]

    @reactive.calc
    def cars_validator() -> pb.Validate:
        schema = pb.Schema(
            columns=[
                ("mfr", "object"),
                ("model", "object"),
                ("year", "int64"),
                ("trim", "object"),
                ("bdy_style", "object"),
                ("hp", "float64"),
                ("hp_rpm", "float64"),
                ("trq", "float64"),
                ("trq_rpm", "float64"),
                ("mpg_c", "float64"),
                ("mpg_h", "float64"),
                ("drivetrain", "object"),
                ("trsmn", "object"),
                ("ctry_origin", "object"),
                ("msrp", "float64"),
            ]
        )
        valid = (
            pb.Validate(car_data(), thresholds=(0, 0, 0), label="Cars Validation")
            .col_schema_match(schema, complete=True)
            .col_vals_ge("hp", 0)
            .col_vals_ge("hp_rpm", 5500)
            .col_vals_in_set("bdy_style", ["coupe", "sedan", "convertible"])
            .col_vals_in_set("ctry_origin", ["Germany", "Italy", "United Kingdom", "United States"])
            .interrogate()
        )
        return valid

    @reactive.calc
    def cars_table() -> GT:
        return (
            GT(car_data().head(user_in.num_rows()).reset_index())
            .tab_style(style.text(weight="bold"), loc.column_header())
            .cols_hide("index")
            .fmt_currency("msrp", use_subunits=False)
            .cols_label(
                {
                    "mfr": "Make",
                    "model": "Model",
                    "year": "Year",
                    "trim": "Package",
                    "bdy_style": "Body",
                    "drivetrain": "Drivetrain",
                    "trsmn": "Transmission",
                    "ctry_origin": "Country",
                    "msrp": "MSRP",
                    "hp": "Power ({{hp}})",
                    "hp_rpm": "Engine Speed @ Max. Power ({{rev / min}})",
                    "trq": "Torque ({{ft-lbs}})",
                    "trq_rpm": "Engine Speed @ Max. Torque ({{rev / min}})",
                    "mpg_c": "City",
                    "mpg_h": "Highway",
                }
            )
            .opt_stylize()
        )

    _ = pb_server(
        CARS_ID,
        validator=cars_validator,  # reactive calc that defines the validation
        display_table=cars_table,  # optional reactive calc that returns a GT of the dataset
    )


app = App(app_ui, server)

@phobson phobson marked this pull request as draft March 5, 2025 06:01
@codecov-commenter

Codecov Report

Attention: Patch coverage is 0% with 47 lines in your changes missing coverage. Please review.

Project coverage is 97.57%. Comparing base (ffe7a15) to head (5c932ee).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pointblank/shiny.py 0.00% 47 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #77      +/-   ##
==========================================
- Coverage   99.00%   97.57%   -1.43%     
==========================================
  Files          16       17       +1     
  Lines        3208     3255      +47     
==========================================
  Hits         3176     3176              
- Misses         32       79      +47     
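
The percentages in the diff above can be reproduced from the Hits and Lines rows. The short check below is a sketch only; coverage_pct is a hypothetical helper, and it assumes coverage is simply hits divided by lines, rounded to two decimals.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals (assumed formula)."""
    return round(100 * hits / lines, 2)


# Figures taken from the coverage diff: hits stay at 3176 while lines grow by 47.
base = coverage_pct(3176, 3208)  # main
head = coverage_pct(3176, 3255)  # this PR
delta = round(head - base, 2)
```

Because the 47 new lines in pointblank/shiny.py add no hits, the whole drop comes from the larger denominator.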


@rich-iannone
Member

rich-iannone commented Mar 5, 2025

Thanks for contributing! I'll try to answer your questions here:

  1. I've never worked with Quarto and haven't yet tried to build the docs for pointblank. Will a Quarto doc with server: shiny work with the doc builds? More to the point, how would you like this documented?

I haven't tried it either, so I recommend giving it a try locally. To build the docs on your machine, run make docs-build at the project root. Just make sure you have a fairly recent version of the Quarto CLI installed.

And I'm actually not sure where the docs would go in this case. One option is to just create an article that would go in the user guide. It would be great to document the module in the API reference but it's currently unclear how that could be done.

  2. For the example, do you have a preference for working with pandas over polars? Narwhals? I know the narwhals/great_tables/pointblank ecosystem is actively evolving. Looking ahead, what would you like to see here?

I have a preference for Polars over Pandas, but when I write examples for the User Guide, I try to include example tables that are a mix of Polars, Pandas, and DuckDB (via Ibis). What I'm actively trying to show is that Pointblank generally works well with Narwhals and Ibis backend tables.

  3. I'm sure I'll have more.

Ask me questions anytime! I may not have all the answers but any sort of discussion might help nudge you in the right direction.

Hope this helps a bit. This is all new ground so feel free to experiment and change things up as needed!

3 participants