Fix KNNImputer #1165

Open: 17 commits to be merged into base: main

@xadupre (Collaborator) commented Feb 4, 2025

No description provided.

@ashtarimo

The KNNImputer converter doesn't generate the same values as scikit-learn for the imputed missing values. I am using:

onnxruntime 1.18.1, onnx 1.17.0, and skl2onnx 1.17.0

Here is the code:

import numpy as np
import onnx
import onnxruntime as rt
import skl2onnx
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.impute import KNNImputer

# Random data with a single missing value in the last row
np.random.seed(42)
data = np.random.randn(100, 10)
data[99, 9] = np.nan
# Fit and transform with scikit-learn (reference output)
imputer = KNNImputer(n_neighbors=5)
imputer.fit(data)
dataft = imputer.transform(data)
# Convert to ONNX and save the model
initial_type = [('float_input', FloatTensorType([None, data.shape[1]]))]
onnx_model = convert_sklearn(imputer, initial_types=initial_type)
with open("knn_imputer.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
# Run the converted model with onnxruntime on float32 input
sess = rt.InferenceSession("knn_imputer.onnx")
input_data = data.astype(np.float32)
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
res = sess.run([output_name], {input_name: input_data})

KNNImputer generates a value of -0.67440104 for the missing value, while the ONNX model generates -1.0071397 for the same missing value.
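
For reference, here is a minimal NumPy sketch of the value one would expect for a single missing entry, assuming the documented scikit-learn defaults (uniform weights, NaN-aware Euclidean distance, donors taken from the training rows where that column is present). The function names `nan_euclidean` and `knn_impute_row` are made up for this illustration, this is not the converter's code, and the sketch ignores edge cases such as rows with no usable donor.

```python
import numpy as np


def nan_euclidean(x, Y):
    """Distance from row x to every row of Y, ignoring coordinates missing in either."""
    present = ~np.isnan(x)[None, :] & ~np.isnan(Y)
    sq = (np.where(present, Y - x[None, :], 0.0) ** 2).sum(axis=1)
    n_present = present.sum(axis=1)
    # Rescale by (total coordinates / shared coordinates); NaN if nothing is shared.
    scale = np.where(n_present > 0, x.size / np.maximum(n_present, 1), np.nan)
    return np.sqrt(scale * sq)


def knn_impute_row(x, train, n_neighbors=5):
    """Fill each NaN in x with the mean of that column over the nearest donors."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    for col in np.flatnonzero(np.isnan(x)):
        # Donors are training rows where this column is observed.
        donors = train[~np.isnan(train[:, col])]
        # Distances use the original row, not the partially filled one.
        dist = nan_euclidean(x, donors)
        nearest = np.argsort(dist, kind="stable")[:n_neighbors]
        out[col] = donors[nearest, col].mean()
    return out


# Toy data (hypothetical, not the data from the report above).
train = np.array([[0.0, 0.0], [1.0, 10.0], [2.0, 20.0], [10.0, 100.0]])
filled = knn_impute_row(np.array([1.2, np.nan]), train, n_neighbors=2)
# The second entry is 15.0, the mean of the two nearest donors (10 and 20).
print(filled)
```

Running the same kind of hand computation on the reported row could help tell which of the two outputs matches the expected neighbours.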

Signed-off-by: xadupre <[email protected]>
@xadupre xadupre changed the title Add failing unit test for KNNImputer Fix KNNImputer Feb 18, 2025
Signed-off-by: xadupre <[email protected]>
):
    gr = ModelComponentContainer({"": g.main_opset}, as_function=True)
    donors_mask = gr.make_tensor_input("donors_mask")
    _donors = gr.make_tensor_input("donors")

Check notice (Code scanning / CodeQL): Unused local variable

Variable _donors is not used.
def _get_torch_knn_imputer(self):
    import torch

    def _get_weights(dist, weights):

Check notice (Code scanning / CodeQL, test code): Explicit returns mixed with implicit (fall through) returns

Mixing implicit and explicit returns may indicate an error as implicit returns always return None.
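
To illustrate what this notice is about, here is a small standalone sketch. The two functions below are hypothetical and are not the `_get_weights` from this PR; they only show how a fall-through path silently returns None, and one way to make every path explicit.

```python
def weights_mixed(dist, weights):
    """One explicit return, every other path falls through and returns None."""
    if weights == "distance":
        return [1.0 / d for d in dist]
    # No return here: "uniform" and a misspelled option both yield None.


def weights_explicit(dist, weights):
    """Every path either returns explicitly or raises."""
    if weights in (None, "uniform"):
        return None  # uniform weighting, stated on purpose
    if weights == "distance":
        return [1.0 / d for d in dist]
    raise ValueError(f"unsupported weights: {weights!r}")


print(weights_mixed([2.0, 4.0], "typo"))         # None, the typo goes unnoticed
print(weights_explicit([2.0, 4.0], "distance"))  # [0.5, 0.25]
```

With the explicit version, an unexpected option fails loudly instead of being treated like uniform weighting.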
)
i_col = gr.make_tensor_input("i_col")
x = gr.make_tensor_input("x")
_dist_subset = gr.make_tensor_input("dist_subset")

Check notice (Code scanning / CodeQL): Unused local variable

Variable _dist_subset is not used.
Signed-off-by: xadupre <[email protected]>