Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make last budget year flexible #2846

Closed
wants to merge 48 commits into from
Closed

Conversation

jdebacker
Copy link
Member

This PR uses the new ParamTools package (0.19.0) to update the max year in the default parameter JSON files so that users can define their preferred last budget year. The default remains a 10 year budget window, but with certain data files (e.g., the TMD file) the user may wish to extend the budget window further.

@martinholmer martinholmer marked this pull request as ready for review December 6, 2024 20:54
@jdebacker
Copy link
Member Author

@martinholmer writes

The setup.py file is still out of sync with the environment.yml and conda.recipes/meta.yaml files.

Can you tell me what's not in sync? They look good to me and the test_4package.test_for_consistency() is passing.

@martinholmer
Copy link
Collaborator

@jdebacker asked:

Can you tell me what's not in sync?

As I said in this comment:

In particular, in the setup.py file, paramtools is specified as >=0.19.0 the same as it is in environment.yml, but numpy can be any version in setup.py even though it is required to be >=1.26 in environment.yml. Why the inconsistency?

And there are similar inconsistencies between the treatment of paramtools and other packages.

@jdebacker
Copy link
Member Author

And there are similar inconsistencies between the treatment of paramtools and other packages.

@martinholmer Except for ParamTools, we've not been specifying pins of the packages in setup.py even when they exist in environment.yml and and meta.yaml. The test_4package.test_for_consistency() ignores even checking for consistency with setup.py, which I inferred was because we didn't need such pins in that file. I only updated the Paramtools pin in setup.py since it was already there. So while we may consider some changes to pinning all or none of the install_requires packages in setup.py, this PR is consistent with current taxcalc conventions.

@martinholmer
Copy link
Collaborator

@jdebacker said:

Except for paramtools, we've not been specifying pins of the packages in setup.py even when they exist in environment.yml and meta.yaml. The test_4package.test_for_consistency() ignores even checking for consistency with setup.py, which I inferred was because we didn't need such pins in that file. I only updated the paramtools pin in setup.py since it was already there. So while we may consider some changes to pinning all or none of the install_requires packages in setup.py, this PR is consistent with current taxcalc conventions.

This explanation is not convincing.

First, let's delve into what exact are "current taxcalc conventions". The paramtools package was first used in taxcalc in May 2020 (see PR #2401) and there was no mention of paramtools in setup.py until you added a full list of package dependencies, including paramtools, in Feb 2024 (see in PR #2721), which was nearly four years after paramtools had been being used by taxcalc. It is not clear from the repository history why the full list of dependency packages was added. But what is clear is the setup.py list contained no minimum required versions for any of the packages:

Screenshot 2024-12-07 at 12 55 47 PM

Second, in Sep 2024 PR #2799 added a minimum required version number to just paramtools in the setup.py list of dependencies. And as far as I can see, there was no explanation for why the minimum required version was added to just paramtools in PR #2799.

Screenshot 2024-12-07 at 1 02 35 PM

So, we see that "current taxcalc conventions" date back to changes made only a few months ago.

After this review of setup.py history, I see no reason for including a minimum required version for paramtools in the setup.py file because the minimum required version for several of the dependencies (including paramtools) are already specified in both the environment.yml and conda.recipe/meta.yaml files.
So, instead of the following change in this PR:

        "numpy",
        "pandas",
        "bokeh", 
        "numba",
        "requests",
-       "paramtools>=0.18.3",
+       "paramtools>=0.19.0",
    ],

the change should be:

        "numpy",
        "pandas",
        "bokeh", 
        "numba",
        "requests",
-       "paramtools>=0.18.3",
+       "paramtools,
    ]

@jdebacker
Copy link
Member Author

@martinholmer Thanks for noting the lack of documentation in some changes to setup.py. These were made to fix issues (e.g., pip installing taxcalc in Google Colab, but then finding that paramtools was missing or of the wrong version), but I didn't give much thought to the best practices for the install_requires arguments in setup.py.

After reading some best practices, I think we should use the same version pins in setup.py as in enviornment.yml and meta.yaml. Test ensure these are kept in sync, I would propose adding setup.py to the test_4package.test_for_consistency() test. This will ensure that we are requiring the same versions of dependencies be installed for both the package on PyPI (which is specified in setup.py) and the package on Conda-Forge (which is specified in meta.yaml) -- as well as in the development environment (specified in environment.yml).

Does this sound good to you? Anything to add?

Once we agree, I can add those changes to this PR.

@martinholmer
Copy link
Collaborator

@jdebacker said:

After reading some best practices, I think we should use the same version pins in setup.py as in enviornment.yml and meta.yaml. [In order to] ensure these are kept in sync, I would propose adding setup.py to the test_4package.test_for_consistency() test. This will ensure that we are requiring the same versions of dependencies be installed for both the package on PyPI (which is specified in setup.py) and the package on Conda-Forge (which is specified in meta.yaml) -- as well as in the development environment (specified in environment.yml).

OK. This sounds sensible. And this approach makes the treatment of paramtools consistent with the treatment of numpy, etc. Thanks for the "best practices" link.

@martinholmer
Copy link
Collaborator

martinholmer commented Dec 8, 2024

@jdebacker, PR #2846 does not revise the static method Policy.tmd_constructor to work with the new flexible LAST_BUDGET_YEAR logic.

Changes suggested below need to revised somewhat given new Policy.tmd_constructor static method implemented in PR #2850.

You need to make changes along the lines suggested below:

     @staticmethod
-    def tmd_constructor(growfactors_path):  # pragma: no cover
+    def tmd_constructor(growfactors_path,
+                        last_budget_year=DEFAULT_LAST_BUDGET_YEAR,
+    ):  # pragma: no cover
         """
         Static method returns a Policy object instantiated with TMD
         input data.  This convenience method works in a analogous way
@@ -112,7 +114,10 @@ class Policy(Parameters):
         assert isinstance(growfactors_path, Path)
         gf_filename = str(growfactors_path)
         tmd_growfactors = GrowFactors(growfactors_filename=gf_filename)
-        return Policy(gfactors=tmd_growfactors)
+        return Policy(
+            gfactors=tmd_growfactors,
+            last_budget_year=last_budget_year,
+        )
 
     @staticmethod
     def read_json_reform(obj):

@martinholmer
Copy link
Collaborator

@jdebacker, I'll resume my review of PR #2846 after you make the changes mentioned in the following two comments:
#2846 (comment)
#2846 (comment)

@martinholmer
Copy link
Collaborator

martinholmer commented Dec 11, 2024

Using PR #2846 after commit 71b228e, my testing of the Python Tax-Calculator API suggests that the new flexible LAST_BUDGET_YEAR logic is working (details below), but that the Command-Line API does not incorporate the new logic.

Here is what I did after running ./gitpr 2846 to get a local pr-2846 branch:

(taxcalc-dev) tmd% pushd ~/work/Tax-Calculator; make package; git branch; popd
Obtaining file:///Users/mrh/work/Tax-Calculator
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
---[SNIP]---
Requirement already satisfied: paramtools>=0.19.0
---[SNIP]---
Building wheels for collected packages: taxcalc
  Building editable for taxcalc (pyproject.toml) ... done
  Created wheel for taxcalc: filename=taxcalc-4.3.4-0.editable-py3-none-any.whl size=4797 sha256=c72a2b64e0110e34a475db4f9c76854f0a837ad3449aeca070dc9f2f289e086b
  Stored in directory: /private/var/folders/1q/3c2zxn3n3tz2lxp5h842dkhh0000gn/T/pip-ephem-wheel-cache-9f7uaun7/wheels/3e/36/68/d2d5abc503fef043ec91135d17e8ea410eefb3eff54f10238b
Successfully built taxcalc
Installing collected packages: taxcalc
Successfully installed taxcalc-4.3.4
  master
* pr-2846
  thru74


(taxcalc-dev) tmd% cat iitax2070.py 
"""
Script for testing Python API to Tax-Calculator when using TMD input data.
"""

import sys
from pathlib import Path
import taxcalc as tc

TAX_CALC_YEAR = 2070
REFORM_YEAR = TAX_CALC_YEAR - 10
USE_REFORM = True
REFORM = {"II_em": {REFORM_YEAR: 30e3}}
USE_ASSUMP_GDF = True
ASSUMP_GDF = {"AWAGE": {(REFORM_YEAR+1): 0.02, (REFORM_YEAR+2): 0.0}}
USE_ASSUMP_CON = True
ASSUMP_CON = {"MPC_e18400": {TAX_CALC_YEAR: 0.2}}

ID_PATH = Path(".") / "tmd.csv"
WT_PATH = Path(".") / "tmd_weights.csv.gz"
GF_PATH = Path(".") / "tmd_growfactors.csv"

# compute number of policy years
NUM_YEARS = TAX_CALC_YEAR - tc.Policy.JSON_START_YEAR + 1

# show nature of testing
print("TAX_CALC_YEAR=", TAX_CALC_YEAR)
print("NUM_YEARS=", NUM_YEARS)
print("USE_ASSUMP_GDF=", USE_ASSUMP_GDF)
if USE_ASSUMP_GDF:
    print("ASSUMP_GDF=", ASSUMP_GDF)
print("USE_REFORM=", USE_REFORM)
if USE_REFORM:
    print("REFORM=", REFORM)
print("USE_ASSUMP_CON=", USE_ASSUMP_CON)
if USE_ASSUMP_CON:
    print("ASSUMP_CON=", ASSUMP_CON)

# construct GrowFactors object
tmd_gf = tc.GrowFactors(growfactors_filename=str(GF_PATH))
if USE_ASSUMP_GDF:
    gdiff = tc.GrowDiff(nyrs=NUM_YEARS)
    gdiff.update_growdiff(ASSUMP_GDF)
    gdiff.apply_to(tmd_gf)

# construct Policy object
pol = tc.Policy(gfactors=tmd_gf, last_budget_year=TAX_CALC_YEAR)
assert gdiff.num_years == pol.num_years, "gdiff/pol num years difference"
if USE_REFORM:
    pol.implement_reform(REFORM)

# construct Records object
rec = tc.Records.tmd_constructor(
    data_path=ID_PATH,
    weights_path=WT_PATH,
    growfactors=tmd_gf,
)

# construct Consumption object
con = tc.Consumption(nyrs=NUM_YEARS)
assert con.num_years == pol.num_years, "con/pol num years difference"
if USE_ASSUMP_CON:
    con.update_consumption(ASSUMP_CON)

# construct Calculator object and calculate iitax for TAX_CALC_YEAR
cal = tc.Calculator(policy=pol, records=rec, consumption=con)
cal.advance_to_year(TAX_CALC_YEAR)
cal.calc_all()
print("weighted_iitax_in_2070($B)=", cal.weighted_total("iitax") * 1e-9)
_, mtr_inctax, _ = cal.mtr(calc_all_already_called=True,
                           wrt_full_compensation=False)
print("unweighted_mean_iitax_mtr(%)=", mtr_inctax.mean() * 100)

sys.exit(0)


(taxcalc-dev) tmd% python iitax2070.py   
TAX_CALC_YEAR= 2070
NUM_YEARS= 58
USE_ASSUMP_GDF= True
ASSUMP_GDF= {'AWAGE': {2061: 0.02, 2062: 0.0}}
USE_REFORM= True
REFORM= {'II_em': {2060: 30000.0}}
USE_ASSUMP_CON= True
ASSUMP_CON= {'MPC_e18400': {2070: 0.2}}
weighted_iitax_in_2070($B)= 16381.71847813919
unweighted_mean_iitax_mtr(%)= 22.414417978948155


(taxcalc-dev) tmd% tc tmd.csv 2070       
ERROR: tax_year 2070 greater than policy.end_year 2034
USAGE: tc --help

Reviewing the proposed PR #2846 code changes, it seems as if the CLI code in taxcalcio.py has not been modified to use the new flexible LAST_BUDGET_YEAR logic.

@jdebacker
Copy link
Member Author

@martinholmer Thanks for finding this omission. I'm not a taxcalc CLI user, but I will make the necessary changes there.

@martinholmer
Copy link
Collaborator

@jdebacker, The new flexible LAST_BUDGET_YEAR logic is sufficiently complicated that I think it should have a test. I suggest you add the following test at the bottom of the taxcalc/tests/test_cpscsv.py module:

def test_flexible_last_budget_year(cps_fullsample):
    """
    Test flexible LAST_BUDGET_YEAR logic using cps.csv file.
    """
    tax_calc_year = Policy.DEFAULT_LAST_BUDGET_YEAR - 1
    growdiff_year = tax_calc_year - 1
    growdiff_dict = {'AWAGE': {growdiff_year: 0.01, tax_calc_year: 0.0}}

    def default_calculator(growdiff_dictionary):
        """
        Return CPS-based Calculator object using default LAST_BUDGET_YEAR.
        """
        g_factors = GrowFactors()
        gdiff = GrowDiff()
        gdiff.update_growdiff(growdiff_dictionary)
        gdiff.apply_to(g_factors)
        pol = Policy(gfactors=g_factors)
        rec = Records.cps_constructor(data=cps_fullsample, gfactors=g_factors)
        calc = Calculator(policy=pol, records=rec)
        return calc

    def flexible_calculator(growdiff_dictionary, last_b_year, num_years):
        """
        Return CPS-based Calculator object using custom LAST_BUDGET_YEAR.
        """
        g_factors = GrowFactors()
        gdiff = GrowDiff(nyrs=num_years)
        gdiff.update_growdiff(growdiff_dictionary)
        gdiff.apply_to(g_factors)
        pol = Policy(gfactors=g_factors, last_budget_year=last_b_year)
        rec = Records.cps_constructor(data=cps_fullsample, gfactors=g_factors)
        calc = Calculator(policy=pol, records=rec)
        return calc

    # begin main test logic
    cdef = default_calculator(growdiff_dict)
    cdef.advance_to_year(tax_calc_year)
    cdef.calc_all()
    iitax_def = round(cdef.weighted_total('iitax'))

    num_years = tax_calc_year - Policy.JSON_START_YEAR + 1
    cflx = flexible_calculator(growdiff_dict, tax_calc_year, num_years)
    cflx.advance_to_year(tax_calc_year)
    cflx.calc_all()
    iitax_flx = round(cflx.weighted_total('iitax'))

    assert np.allclose([iitax_flx], [iitax_def])

@martinholmer
Copy link
Collaborator

@jdebacker, No need to do "add budget year end to command line args" in commit 2593f2b

Just use the existing TAXYEAR CLI argument as the "budget end year".

@martinholmer
Copy link
Collaborator

martinholmer commented Dec 15, 2024

@jdebacker, Note sure why you're adding the requests package to Tax-Calculator requirements. It is not used anywhere in the Tax-Calculator code. It was carelessly imported at the top of the parameters.py file even though it not used in the code.
PR #2852 fixes that by removing the unneeded import requests statement in that file.

(taxcalc-dev) Tax-Calculator% git branch
* 4-3-5
  master
  thru74
(taxcalc-dev) Tax-Calculator% grep -r --include="*" requests .
./taxcalc/reforms/TCJA.md:Tax-Calculator 0.14.3.  See pull requests #1818 and #1819 for details
./docs/about/releases.md:See commit history for pull requests before
./setup.py:        "requests",                      <<<<<<<<<<<<<< UNNEEDED
./taxcalc.egg-info/PKG-INFO:Requires-Dist: requests <<<<<<<<<<<<<< OBSOLETE
./taxcalc.egg-info/requires.txt:requests            <<<<<<<<<<<<<< OBSOLETE
(taxcalc-dev) Tax-Calculator% 

@jdebacker
Copy link
Member Author

Closing in favor or @martinholmer's PR #2856

@jdebacker jdebacker closed this Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants