SurrealML with state of the art models #66

sFritsch09 · 2025-02-05T19:06:52Z

I am wondering whether it is possible, or planned for the near future, to upload local models such as Llama 3.2 or DeepSeek and import them into SurrealDB by converting them to surml.
Training a model with sklearn or PyTorch is old tech! We can already train models in a much more advanced way with tools like unsloth, LlamaFactory, or ZenML.

Why train a model from scratch when I can train a strong model that is already pretty good at handling data?

maxwellflitton · 2025-02-06T00:04:53Z

It depends on what you want. For instance, if you're making decisions in finance or insurance, you need to ensure that you are adhering to regulations with backtesting and explainable weights. You are going to want to use PyTorch, sklearn, or TensorFlow for these. I myself apply ML at the London centre of bioengineering in surgical robotics, and we very much use PyTorch, nothing else. A lot of people I know who work professionally in ML for academia or industry in central London use PyTorch, TensorFlow, or sklearn. If we look at Google Trends, we can see that PyTorch is still widely searched.

Right now I am working on C lib wrappers so we can have better integration with other languages and better deployment. Initially, it makes sense to support the ML frameworks that are most widely used professionally, as they have established ecosystems and quality control methods, and the professionals using these frameworks can explain and trace the exact data passed into the model, because they want to avoid legal action and adhere to regulations.

That being said, we can offer support for something like Llama. Machine learning models are essentially matrix operations, and the ONNX format stores the weights together with the computational graph, serialized as protobuf. We support raw ONNX, as you can see at the following link:

https://github.com/surrealdb/surrealml?tab=readme-ov-file#raw-onnx-models
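To make the "computational graph" idea concrete, here is a toy sketch in plain Python. It is not the ONNX schema or the surrealml API: the `Node` class, the `run_graph` function, and the weights are all made up for illustration, and only the operator names (`MatMul`, `Add`, `Relu`) are borrowed from ONNX. It just shows that a model can be stored as named weights plus an ordered list of operations, which a runtime then replays.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One step of the toy graph (hypothetical, not the ONNX NodeProto)."""
    op: str        # operator name, e.g. "MatMul"
    inputs: list   # names of the tensors this node consumes
    output: str    # name of the tensor this node produces


def matmul(a, b):
    # Plain nested-list matrix product: rows of a times columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    # b is a single-row bias that is added to every row of a.
    return [[x + y for x, y in zip(row, b[0])] for row in a]


def relu(a):
    return [[max(0.0, x) for x in row] for row in a]


OPS = {"MatMul": matmul, "Add": add, "Relu": relu}


def run_graph(nodes, weights, feeds):
    """Execute nodes in order, looking tensors up by name."""
    tensors = {**weights, **feeds}
    for node in nodes:
        args = [tensors[name] for name in node.inputs]
        tensors[node.output] = OPS[node.op](*args)
    return tensors


# Stored "model": weights plus the graph that uses them (illustrative values).
weights = {"W": [[1.0, -1.0], [2.0, 0.5]], "b": [[0.5, -3.0]]}
nodes = [
    Node("MatMul", ["x", "W"], "h"),
    Node("Add", ["h", "b"], "z"),
    Node("Relu", ["z"], "y"),
]

result = run_graph(nodes, weights, {"x": [[1.0, 2.0]]})
print(result["y"])
```

A real ONNX file carries the same two ingredients, the graph and the weights, in a protobuf message, and a runtime such as onnxruntime plays the role of `run_graph`.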

ONNX is the established standard for storing these computational graphs. This means that any serious machine learning model you come across should have an ONNX representation. Below is the Microsoft repo that explains the theory behind converting Llama to ONNX:

https://github.com/microsoft/Llama-2-Onnx

And below is documentation on how they accelerated inference with Llama in the onnxruntime:

https://onnxruntime.ai/blogs/accelerating-llama-2

The surml core engine uses the onnxruntime to execute the model in the database. So if you get an ONNX representation of Llama, you can run it on SurrealML. I've also now looked at the code for unsloth and LlamaFactory; they're wrappers around PyTorch. So instead of us maintaining interfaces around unsloth and LlamaFactory, which I'm sure will change over time, you should be able to train your model using unsloth or LlamaFactory and convert it to surml using the TORCH engine.
