Consider adding examples #47

munckymagik · 2019-05-11T11:59:17Z

I suggest we consider adding an examples folder to demonstrate more real world usage.

The benefits I think this would bring are:

By seeing the library used in a more realistic situation we may learn things about the design that the tests/doc-tests didn't reveal.
We help users get started more quickly for typical use-cases.
We get to set up some good usage patterns for others to follow.

Can we brain-storm a list of the kinds of examples we would want?

Does anybody have any toy examples we could use to seed the folder?


LukeMathWalker · 2019-07-10T07:13:48Z

I finally have time to go back to this ❤️

We could start by porting some of the examples in the SciPy cookbook.
Considering that we have focused on the stats section of SciPy, these could be relevant to us:

Fisher's linear discriminant
Data fitting (leveraging ndarray-stats together with argmin)

Do you know any other collections of examples we can poach from @munckymagik?

munckymagik · 2019-07-18T08:30:59Z

@LukeMathWalker sorry for the delay, some good ones here maybe: https://github.com/ddbourgin/numpy-ml.

Should we create a checklist organised by the major feature areas in our crate, and try to propose at least one compelling example for each? I realise there might be some crossover, so one example may cover several areas.

I'm going to need you to guide me as to what kinds of examples would suit our crate and be good starting points for users with real problems to solve. Maybe a mix of common-use items plus something more niche? IDK, maybe regression, F-score, p-values, confidence intervals, etc. (???)

Then there's what to do about sample data. I found:

RustLearn's example downloads a dataset from an archive
rustml::datasets provides generators and the MNIST dataset
openml-rust is a client for http://openml.org/

What do you think?

munckymagik · 2019-07-18T11:48:11Z

Also, have you used any of the rust plotting libraries?

LukeMathWalker · 2019-07-20T15:25:40Z

We thought of the same repo there - I have drafted a first linear regression example using ndarray-linalg and ndarray-stats, see rust-ndarray/ndarray-linalg#166.
I have filed the PR against ndarray-linalg because it might be a little complicated to deal with the BLAS backend in this crate, but we can sort it out if we wanted to.

The examples you mentioned are a good starting point. In terms of ML, we could have a look at some stuff in the preprocessing space:

given a (n_samples, n_features) input matrix, keep only the columns whose variance is above a threshold (scikit-learn equivalent);
given a (n_samples, n_features) input matrix, recursively remove the columns that have a Pearson correlation score above a certain threshold;
multiclass logistic regression, using our cross_entropy method as loss function.
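
To make the first item above concrete, here is a minimal sketch of variance-threshold column selection. It deliberately uses plain `std` Rust with a row-major `Vec<Vec<f64>>` so that it runs without any dependencies; it is not the ndarray-stats API. The function names `column_variances` and `select_by_variance` are hypothetical, the variance is assumed to be the population variance (divide by `n_samples`), and all rows are assumed to have the same length.

```rust
/// Population variance (divide by n_samples) of each column of a
/// row-major (n_samples, n_features) matrix.
/// Assumes every row has the same length.
fn column_variances(data: &[Vec<f64>]) -> Vec<f64> {
    let n_samples = data.len();
    if n_samples == 0 {
        return Vec::new();
    }
    let n_features = data[0].len();
    let n = n_samples as f64;
    (0..n_features)
        .map(|j| {
            let mean = data.iter().map(|row| row[j]).sum::<f64>() / n;
            data.iter().map(|row| (row[j] - mean).powi(2)).sum::<f64>() / n
        })
        .collect()
}

/// Keeps only the columns whose variance is strictly above `threshold`.
/// Returns the indices of the kept columns and the filtered matrix.
fn select_by_variance(data: &[Vec<f64>], threshold: f64) -> (Vec<usize>, Vec<Vec<f64>>) {
    let kept: Vec<usize> = column_variances(data)
        .into_iter()
        .enumerate()
        .filter(|&(_, v)| v > threshold)
        .map(|(j, _)| j)
        .collect();
    let filtered = data
        .iter()
        .map(|row| kept.iter().map(|&j| row[j]).collect())
        .collect();
    (kept, filtered)
}
```

Calling `select_by_variance(&data, 0.0)` drops constant columns only; a real example for the crate would presumably compute the per-column variance with ndarray instead of the hand-rolled loop.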

Given the nature of our crate, I think that to make it shine we need examples that do require vectorised operations - ML is a fantastic domain for these purposes.

For datasets, I think we can either generate them (as in my linear regression example) or we could use openml-rust, it seems sufficiently plug-and-play.
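
As a rough idea of what the "generate them" option could look like, here is a dependency-free sketch that builds a noisy linear dataset. It is not the code from the linear regression example mentioned above. The names `Lcg` and `make_linear_dataset` are hypothetical, the small linear congruential generator stands in for a proper random number crate, and the noise is assumed to be uniform rather than Gaussian.

```rust
/// Tiny linear congruential generator, used only to keep this sketch
/// dependency-free. It is not suitable for serious statistical work.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits so the value is exactly representable as an f64.
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Generates `n_samples` rows with features drawn uniformly from [0, 1) and
/// targets y = w . x + bias + noise * u, where u is uniform in [-0.5, 0.5).
fn make_linear_dataset(
    n_samples: usize,
    weights: &[f64],
    bias: f64,
    noise: f64,
    seed: u64,
) -> (Vec<Vec<f64>>, Vec<f64>) {
    let mut rng = Lcg(seed);
    let mut x = Vec::with_capacity(n_samples);
    let mut y = Vec::with_capacity(n_samples);
    for _ in 0..n_samples {
        // One feature per weight.
        let row: Vec<f64> = weights.iter().map(|_| rng.next_f64()).collect();
        let signal: f64 = row.iter().zip(weights).map(|(xi, wi)| xi * wi).sum();
        let eps = noise * (rng.next_f64() - 0.5);
        y.push(signal + bias + eps);
        x.push(row);
    }
    (x, y)
}
```

Fixing the seed makes the example reproducible, which keeps any numbers quoted alongside an example stable from run to run.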

I opened a thread a while ago on Reddit for plotting libraries, but none of those I have seen so far seemed mature enough. I don't know if the landscape has changed significantly since.

munckymagik · 2019-08-09T09:31:35Z

Shutting this crate-specific issue now. Work to build examples for the ndarray ecosystem is happening in https://github.com/rust-ndarray/ndarray-examples.

LukeMathWalker mentioned this issue May 18, 2019

Add deviation functions #41

Merged

20 tasks

LukeMathWalker mentioned this issue Jul 17, 2019

Linear regression example rust-ndarray/ndarray-linalg#166

Closed

munckymagik closed this as completed Aug 9, 2019
