sum_to_zero_vector case study #229

mitzimorris · 2025-03-03T16:01:45Z

Transferring the contents of the GitHub repo https://github.com/mitzimorris/sum_to_zero_vector to this repo.

This case study introduces the sum_to_zero_vector. It demonstrates a simple workflow for evaluating the performance of different ways to impose a sum-to-zero constraint on a parameter vector.
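As a rough illustration of what "different ways to impose a sum-to-zero constraint" can mean, here is a minimal Python sketch of two common constructions: pinning the last element to the negative sum of the others, and mapping N-1 free values through an orthonormal basis of the sum-to-zero subspace. The function names are hypothetical, and the Helmert-style basis is an assumption chosen for clarity; it is not taken from the case study and may differ from the transform Stan uses internally.

```python
import math


def last_element_constraint(free):
    """Map N-1 free values to N values by setting the last to minus their sum."""
    return list(free) + [-sum(free)]


def orthonormal_constraint(free):
    """Map N-1 free values to N values via an orthonormal basis of the
    sum-to-zero subspace (Helmert-style contrasts; an illustrative choice)."""
    n = len(free) + 1
    x = [0.0] * n
    for k, y in enumerate(free, start=1):
        # k-th basis vector: the first k entries are 1/sqrt(k(k+1)),
        # entry k (0-based) is -k/sqrt(k(k+1)), and the rest are zero.
        scale = 1.0 / math.sqrt(k * (k + 1))
        for i in range(k):
            x[i] += y * scale
        x[k] -= y * k * scale
    return x


free = [0.3, -1.2, 0.5]
print(sum(last_element_constraint(free)))  # zero up to floating-point rounding
print(sum(orthonormal_constraint(free)))   # zero up to floating-point rounding
```

With independent unit-variance free values, the first construction gives the last element a variance of N-1 while the others keep variance 1; the second gives every element the same variance, which is one reason the choice of construction can matter for sampling.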

The HTML file is self-contained. Re-rendering the HTML requires the stan-dev/quarto-config repo, which provides the Stan website styling.

mitzimorris · 2025-03-03T16:08:33Z

hi @spinkney and @WardBrian - not sure if we need reviews to add case studies, but I would appreciate any feedback you might have, if you have time.

WardBrian

I only have one real comment, otherwise this looks great!

Will you also open a PR to add the rendered version to the website?

jupyter/sum-to-zero/sum_to_zero_evaluation.qmd

spinkney

I mostly added comments that will hopefully help the reader see the differences and the results more clearly. I'm happy that you put this together, and I'm happy to chat about any of the comments if you'd like.

jupyter/sum-to-zero/sum_to_zero_evaluation.qmd

spinkney · 2025-03-06T15:06:16Z

jupyter/sum-to-zero/sum_to_zero_evaluation.qmd

+```
+
+**Eth**
+


What is this? What does all the code do? Don't assume your readers know Python that well!

@spinkney: What are you asking for here? Documentation within the code? A description summarizing what it does on the outside?

I would assume the readers know Python pretty well. Otherwise, the intro to Python will overwhelm the intro to sum-to-zero. If you do want to teach people bits of Python, you should have a model reader in mind or it will be very hard to decide where to stop with expounding on the Python code. Is that someone who knows Python syntax but not any libraries like numpy or matplotlib? Or someone even less skilled who doesn't understand Python scoping or control flow? Maybe they don't know what `+=` means or that `range(N)` is from 0 to N-1?

jupyter/sum-to-zero/sum_to_zero_evaluation.qmd

mitzimorris · 2025-04-07T03:43:11Z

rewrote the case study per @spinkney comments - which were absolutely spot on!

- the essential comparisons are presented as tables
- data processing is summarized, but not shown

should I omit the ipynb or update to match?

bob-carpenter · 2025-04-07T19:52:59Z

Because the data-generating parameters and percentage of observations per category are generated at random,
some datasets may have many low count or no count strata, or random effects near zero across several categories
and will therefore be pathologically hard to fit. Using a specified seed for the data-generating program avoids this problem.

This is basically saying that your model fails SBC as specified. One way to fix it would be to (a) tighten the priors so they're not so extreme, and (b) avoid simulating empty strata by generating the fill level uniformly in (0.5, 1) rather than uniformly in (0, 1). That would probably be preferable to a disclaimer that says SBC fails.
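The effect of suggestion (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "fill level" is taken here to mean the fraction of a nominal per-stratum sample size that is actually observed, the count is obtained by simple truncation, and the names `simulate_counts` and `n_sparse` are hypothetical. The case study's own data-generating code may define these quantities differently.

```python
import random


def simulate_counts(n_strata, nominal_size, low, high, rng):
    """Draw one fill level per stratum uniformly in (low, high) and
    scale a nominal sample size by it (assumed definition of fill level)."""
    counts = []
    for _ in range(n_strata):
        fill = rng.uniform(low, high)
        # Truncate to a whole number of observations.
        counts.append(int(fill * nominal_size))
    return counts


def n_sparse(counts, threshold):
    """Count strata with fewer than `threshold` observations."""
    return sum(c < threshold for c in counts)


rng = random.Random(1234)  # arbitrary seed, for reproducibility only
wide = simulate_counts(1000, 20, 0.0, 1.0, rng)
tight = simulate_counts(1000, 20, 0.5, 1.0, rng)
print("strata with fewer than 5 observations:",
      n_sparse(wide, 5), "(fill in (0, 1)) vs",
      n_sparse(tight, 5), "(fill in (0.5, 1))")
```

Under these assumptions, a fill level of at least 0.5 guarantees every stratum keeps at least half of its nominal size, so low-count and zero-count strata cannot occur, whereas a fill level drawn from (0, 1) regularly produces them.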

I wouldn't say "data-generating parameters"---the parameters are arguments to the distributions that generate data, but they don't generate data themselves. I would also prefer "zero count" to "no count". I would also prefer "varying effects" to "random effects" for all the reasons Andrew mentions.

How about:

Because the parameters are simulated randomly from their priors and the number of observations per category is also simulated randomly, some data sets may have many low- or zero-count strata. This will lead to varying effect parameters with near-zero values, which are on the boundary of parameter space, and hence very challenging to fit. Cherry-picking a seed that works avoids these issues.

mitzimorris · 2025-04-09T08:54:52Z

@spinkney, @bob-carpenter - ready for review

mitzimorris added 3 commits August 5, 2024 14:54

Merge branch 'master' of https://github.com/stan-dev/example-models

85cdd5d

update README, LICENSE

049d7ba

case study, from mitzimorris github repo

039bec4

mitzimorris requested review from WardBrian and spinkney March 3, 2025 16:03

WardBrian approved these changes Mar 5, 2025

changes per review

e367d4c

spinkney requested changes Mar 6, 2025

mitzimorris added 2 commits April 6, 2025 23:32

revised case study

f880cad

helper functions

0675d82

mitzimorris added 4 commits April 7, 2025 16:06

changes per code review

7022770

checkpointing - addressing reviewer comments

311f112

checkpointing - addressing reviewer comments

bd6c321

better tables

c421fe0

mitzimorris added 3 commits April 9, 2025 04:59

cleanup

47182b0

cleanup

5b1b515

cleanup

ebc336c
