numpy-ml

Ever wish you had an inefficient but somewhat legible collection of machine learning algorithms implemented exclusively in numpy? No?

Models

This repo includes code for the following models:

  1. Gaussian mixture model

    • EM training (a minimal numpy sketch appears after this list)
  2. Hidden Markov model

    • Viterbi decoding (sketched after this list)
    • Likelihood computation
    • MLE parameter estimation via Baum-Welch/forward-backward algorithm
  3. Latent Dirichlet allocation (topic model)

    • Standard model with MLE parameter estimation via variational EM
    • Smoothed model with MAP parameter estimation via MCMC
  4. Neural networks

    • Layers / Layer-wise ops
      • Add
      • Flatten
      • Multiply
      • Softmax
      • Fully-connected/Dense
      • Sparse evolutionary connections
      • LSTM
      • Elman-style RNN
      • Max + average pooling
      • Dot-product attention
      • Restricted Boltzmann machine (w. CD-n training)
      • 2D deconvolution (w. padding and stride)
      • 2D convolution (w. padding, dilation, and stride)
      • 1D convolution (w. padding, dilation, stride, and causality)
    • Modules
      • Bidirectional LSTM
      • ResNet-style residual blocks (identity and convolution)
      • WaveNet-style residual blocks with dilated causal convolutions
      • Transformer-style multi-headed scaled dot product attention
    • Regularizers
      • Dropout
    • Normalization
      • Batch normalization (spatial and temporal)
      • Layer normalization (spatial and temporal)
    • Optimizers
      • SGD w/ momentum
      • AdaGrad
      • RMSProp
      • Adam
    • Learning Rate Schedulers
      • Constant
      • Exponential
      • Noam/Transformer
      • Dlib scheduler
    • Weight Initializers
      • Glorot/Xavier uniform and normal
      • He/Kaiming uniform and normal
      • Standard and truncated normal
    • Losses
      • Cross entropy
      • Squared error
      • Bernoulli VAE loss
      • Wasserstein loss with gradient penalty
    • Activations
      • ReLU
      • Tanh
      • Affine
      • Sigmoid
      • Leaky ReLU
      • ELU
    • Models
      • Bernoulli variational autoencoder
      • Wasserstein GAN with gradient penalty
    • Utilities
      • col2im (MATLAB port)
      • im2col (MATLAB port)
      • conv1D
      • conv2D (the im2col trick behind it is sketched after this list)
      • deconv2D
      • minibatch
  5. Tree-based models

    • Decision trees (CART)
    • [Bagging] Random forests
    • [Boosting] Gradient-boosted decision trees
  6. Linear models

    • Ridge regression (closed form sketched after this list)
    • Logistic regression
    • Ordinary least squares
    • Bayesian linear regression w/ conjugate priors
      • Unknown mean, known variance (Gaussian prior)
      • Unknown mean, unknown variance (Normal-Gamma / Normal-Inverse-Wishart prior)
  7. n-Gram sequence models

    • Maximum likelihood scores
    • Additive/Lidstone smoothing (sketched after this list)
    • Simple Good-Turing smoothing
  8. Reinforcement learning models

    • Cross-entropy method agent
    • First visit on-policy Monte Carlo agent
    • Weighted incremental importance sampling Monte Carlo agent
    • Expected SARSA agent
    • TD(0) Q-learning agent
    • Dyna-Q / Dyna-Q+ with prioritized sweeping
  9. Nonparametric models

    • Nadaraya-Watson kernel regression
    • k-Nearest neighbors classification and regression (regression sketched after this list)
  10. Preprocessing

    • Discrete Fourier transform (1D signals)
    • Discrete cosine transform (type-II) (1D signals)
    • Bilinear interpolation (2D signals)
    • Nearest neighbor interpolation (1D and 2D signals)
    • Autocorrelation (1D signals)
    • Signal windowing
    • Text tokenization
    • Feature hashing
    • Feature standardization
    • One-hot encoding / decoding
    • Huffman coding / decoding
    • Term frequency-inverse document frequency encoding
    • MFCC encoding
  11. Utilities

    • Similarity kernels
    • Distance metrics
    • Priority queues
    • Ball tree data structure
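
A few of the entries above are easier to grok with a toy implementation in hand, so here are some minimal numpy sketches. They are illustrative only: the variable names and signatures are invented for these examples and do not reflect this repo's actual API. First, EM training for a Gaussian mixture model (diagonal covariances assumed for brevity):

```python
import numpy as np

def gmm_em(X, n_components, n_iters=50, seed=0):
    """Fit a Gaussian mixture to the rows of X via expectation-maximization."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialize mixing weights, means, and diagonal covariances
    pi = np.full(n_components, 1.0 / n_components)
    mu = X[rng.choice(N, n_components, replace=False)]
    var = np.ones((n_components, D))

    for _ in range(n_iters):
        # E-step: responsibilities r[n, k] proportional to pi_k * N(x_n | mu_k, var_k)
        log_p = -0.5 * (np.log(2 * np.pi * var).sum(axis=1)
                        + (((X[:, None, :] - mu) ** 2) / var).sum(axis=2))
        log_r = np.log(pi) + log_p
        log_r -= log_r.max(axis=1, keepdims=True)  # subtract max for stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from responsibility-weighted samples
        Nk = r.sum(axis=0)
        pi = Nk / N
        mu = (r.T @ X) / Nk[:, None]
        var = (r.T @ (X ** 2)) / Nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var
```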
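
Viterbi decoding for a discrete-observation HMM is a max-product dynamic program over log probabilities. A minimal sketch (again, argument names are illustrative):

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Most likely hidden state path for an observation sequence.

    obs : (T,) int array of observation indices
    A   : (K, K) transition probabilities, A[i, j] = P(state j | state i)
    B   : (K, V) emission probabilities, B[k, v] = P(obs v | state k)
    pi  : (K,) initial state distribution
    """
    T, K = len(obs), A.shape[0]
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]  # best log-prob ending in each state
    back = np.zeros((T, K), dtype=int)    # backpointers

    for t in range(1, T):
        scores = delta[:, None] + logA    # scores[i, j]: arrive at j from i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]

    # Trace the best path backwards from the most probable final state
    path = np.zeros(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path
```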
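
The im2col/conv2D utilities rest on one trick: unroll every receptive field into a column so that convolution becomes a single matrix multiply. A stride-1, no-padding, single-channel sketch of that idea (this repo's versions also handle padding, stride, and dilation):

```python
import numpy as np

def conv2d(X, K):
    """X: (H, W) single-channel image; K: (fh, fw) filter."""
    H, W = X.shape
    fh, fw = K.shape
    oh, ow = H - fh + 1, W - fw + 1

    # im2col: each output location contributes one column of its receptive field
    cols = np.empty((fh * fw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = X[i:i + fh, j:j + fw].ravel()

    # Convolution collapses to a single matrix-vector product
    return (K.ravel() @ cols).reshape(oh, ow)
```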
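
Ridge regression admits the closed form w = (XᵀX + αI)⁻¹ Xᵀy, which is nearly a one-liner in numpy (`alpha` is an illustrative name for the regularization strength):

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Return ridge regression weights for design matrix X and targets y."""
    D = X.shape[1]
    # Solve the regularized normal equations rather than inverting explicitly
    return np.linalg.solve(X.T @ X + alpha * np.eye(D), X.T @ y)
```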
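
Additive/Lidstone smoothing just adds a pseudo-count γ to every n-gram count so that unseen n-grams get nonzero probability. For bigrams:

```python
import numpy as np

def lidstone_bigram_probs(counts, gamma=0.5):
    """counts: (V, V) matrix with counts[i, j] = #(word i followed by word j)."""
    V = counts.shape[0]
    # Each row is a smoothed conditional distribution P(next word | word i)
    return (counts + gamma) / (counts.sum(axis=1, keepdims=True) + gamma * V)
```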
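
Finally, k-nearest neighbors regression: predict the mean target of the k closest training points under Euclidean distance. This is a brute-force sketch; a ball tree (listed under Utilities) can make the neighbor search faster:

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=5):
    """Mean-of-neighbors regression for each row of X_query."""
    # Pairwise Euclidean distances between query and training points
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbors
    return y_train[idx].mean(axis=1)    # average their targets
```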

Contributing

Am I missing your favorite model? Is there something that could be cleaner / less confusing? Did I mess something up? Submit a PR! The only requirement is that your models are written with just the Python standard library and numpy. The SciPy library is also permitted under special circumstances ;)

See full contributing guidelines here.
