6 files changed, 12 additions, 0 deletions
@@ -12,6 +12,8 @@
 class CatBoostEncoder(BaseEstimator, util.TransformerWithTargetMixin):
     """CatBoost coding for categorical features.
 
+    Supported targets: binomial and continuous. For polynomial target support, see PolynomialWrapper.
+
     This is very similar to leave-one-out encoding, but calculates the
     values "on-the-fly". Consequently, the values naturally vary
     during the training phase and it is not necessary to add random noise.
@@ -16,6 +16,8 @@
 class GLMMEncoder(BaseEstimator, util.TransformerWithTargetMixin):
     """Generalized linear mixed model.
 
+    Supported targets: binomial and continuous. For polynomial target support, see PolynomialWrapper.
+
     This is a supervised encoder similar to TargetEncoder or MEstimateEncoder, but there are some advantages:
     1) Solid statistical theory behind the technique. Mixed effects models are a mature branch of statistics.
     2) No hyper-parameters to tune. The amount of shrinkage is automatically determined through the estimation process.
@@ -14,6 +14,8 @@
 class JamesSteinEncoder(BaseEstimator, util.TransformerWithTargetMixin):
     """James-Stein estimator.
 
+    Supported targets: binomial and continuous. For polynomial target support, see PolynomialWrapper.
+
     For feature value `i`, James-Stein estimator returns a weighted average of:
 
     1. The mean target value for the observed feature value `i`.
@@ -12,6 +12,8 @@
 class MEstimateEncoder(BaseEstimator, util.TransformerWithTargetMixin):
     """M-probability estimate of likelihood.
 
+    Supported targets: binomial and continuous. For polynomial target support, see PolynomialWrapper.
+
     This is a simplified version of target encoder, which goes under names like m-probability estimate or
     additive smoothing with known incidence rates. In comparison to target encoder, m-probability estimate
     has only one tunable parameter (`m`), while target encoder has two tunable parameters (`min_samples_leaf`
@@ -11,6 +11,8 @@
 class TargetEncoder(BaseEstimator, util.TransformerWithTargetMixin):
     """Target encoding for categorical features.
 
+    Supported targets: binomial and continuous. For polynomial target support, see PolynomialWrapper.
+
     For the case of categorical target: features are replaced with a blend of posterior probability of the target
     given particular categorical value and the prior probability of the target over all the training data.
 
@@ -12,6 +12,8 @@
 class WOEEncoder(BaseEstimator, util.TransformerWithTargetMixin):
     """Weight of Evidence coding for categorical features.
 
+    Supported targets: binomial. For polynomial target support, see PolynomialWrapper.
+
     Parameters
     ----------
 
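The added docstring lines point readers with a multi-class ("polynomial") target to PolynomialWrapper. As a rough illustration of why a wrapper is needed, the sketch below shows the general one-vs-rest idea in plain Python: the multi-class target is split into one binary indicator per class, and each indicator is encoded separately. This is not the library's implementation. The function name `one_vs_rest_mean_encode` and the sample data are hypothetical, and the per-category mean used here is an unsmoothed stand-in for whichever binomial encoder would actually be wrapped.

```python
from collections import defaultdict


def one_vs_rest_mean_encode(categories, targets):
    """Hypothetical helper: encode a categorical feature against a multi-class target.

    For every class label, the target is reduced to a binary indicator
    (1 if the row belongs to that class, else 0), and each category is
    replaced by the mean of that indicator. Returns
    {class_label: {category: mean_indicator}}.
    """
    encodings = {}
    for cls in sorted(set(targets)):
        sums = defaultdict(float)   # sum of the binary indicator per category
        counts = defaultdict(int)   # number of rows per category
        for cat, target in zip(categories, targets):
            sums[cat] += 1.0 if target == cls else 0.0
            counts[cat] += 1
        encodings[cls] = {cat: sums[cat] / counts[cat] for cat in counts}
    return encodings


# Illustrative data (assumed, not taken from the commit).
colors = ["red", "red", "blue", "blue", "blue", "green"]
labels = ["a", "b", "a", "a", "c", "c"]
enc = one_vs_rest_mean_encode(colors, labels)
```

Each class label yields its own encoded column per feature, which is why encoders documented as supporting only binomial or continuous targets need a wrapper for multi-class problems.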