Commit

Update GitHub Pages
github-actions committed Jan 28, 2025
1 parent 655b089 commit d2e943e
Showing 5 changed files with 20 additions and 25 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -19,3 +19,4 @@ Fri Nov 15 13:46:13 UTC 2024
Mon Dec 2 21:30:54 UTC 2024
Fri Dec 6 14:34:48 UTC 2024
Mon Dec 9 18:51:57 UTC 2024
+Tue Jan 28 10:21:33 UTC 2025
2 changes: 1 addition & 1 deletion howtos/howto_torch2zml/index.html
@@ -284,7 +284,7 @@ Loading an individual layer

var ctx = try zml.Context.init();
defer ctx.deinit();
-const platform = ctx.autoPlatform();
+const platform = ctx.autoPlatform(.{});
const mlp_weights = try zml.aio.loadModelBuffers(Mlp, mlp_shape, model_weights, allocator, platform);

zml.testing.testLayer(platform, activations, "model.layers.0.mlp", mlp_shape, mlp_weights, 1e-3);
Binary file modified sources.tar
Binary file not shown.
25 changes: 11 additions & 14 deletions tutorials/getting_started/index.html
@@ -157,26 +157,23 @@

# Getting Started with ZML

In this tutorial, we will install ZML and run a few models locally.

## Prerequisites

First, let's check out the ZML codebase. In a terminal, run:

    git clone https://github.com/zml/zml.git
    cd zml/

We use bazel to build ZML and its dependencies. We recommend downloading it through bazelisk, a version manager for bazel.

### Install Bazel

macOS:

    brew install bazelisk

Linux:

-   curl -L -o /usr/local/bin/bazel 'https://github.com/bazelbuild/bazelisk/releases/download/v1.20.0/bazelisk-linux-amd64'
+   curl -L -o /usr/local/bin/bazel 'https://github.com/bazelbuild/bazelisk/releases/download/v1.25.0/bazelisk-linux-amd64'
    chmod +x /usr/local/bin/bazel
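As a quick check that the install worked (this step is an addition here, not from the original page), ask bazelisk to resolve and run the pinned Bazel release:

    # downloads the matching Bazel release on first use, then prints its version
    bazel version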
## Run a pre-packaged model

ZML comes with a variety of model examples. See also our reference implementations in the examples folder (https://github.com/zml/zml/tree/master/examples/).

### MNIST

The classic handwritten digits recognition task (https://en.wikipedia.org/wiki/MNIST_database). The model is tasked to recognize a handwritten digit, which has been converted to a 28x28 pixel monochrome image. Bazel will download a pre-trained model and the test dataset. The program will load the model, compile it, and classify a randomly picked example from the test dataset.

On the command line:

    cd examples
    bazel run -c opt //mnist
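If you want to browse what else is available before picking a model (again an addition, assuming the examples directory is its own Bazel workspace as the `cd examples` steps suggest), you can list every target it defines:

    cd examples
    # enumerate all targets in the examples workspace
    bazel query //...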
### Llama

Llama is a family of "Large Language Models", trained to generate text based on the beginning of a sentence/book/article. This "beginning" is generally referred to as the "prompt".

-#### TinyLlama, Stories 15M
-
-To start, you can use a small model trained specifically on children's history books. This model has been trained by Andrej Karpathy (https://x.com/karpathy); you can read more about it on his Github (https://github.com/karpathy/llama2.c).
-
-    cd examples
-    bazel run -c opt //llama:TinyLlama-Stories-15M
-    bazel run -c opt //llama:TinyLlama-Stories-15M -- --prompt="Once upon a time, there was a cute little dragon"
-
-#### OpenLLama 3B
-
-    cd examples
-    bazel run -c opt //llama:OpenLLaMA-3B
-    bazel run -c opt //llama:OpenLLaMA-3B -- --prompt="Once upon a time,"
-
-#### Meta Llama 3 8B
-
-This model has restrictions (see https://huggingface.co/meta-llama/Meta-Llama-3-8B): it requires approval from Meta on Huggingface, which can take a few hours to get granted.
-
-While waiting for approval, you can already generate your Huggingface access token (/howtos/huggingface_access_token/).
-
-Once you've been granted access, you're ready to download a gated model like Meta-Llama-3-8b!
+#### Meta Llama 3.1 8B
+
+This model has restrictions; see https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct. It requires approval from Meta on Huggingface, which can take a few hours to get granted.
+
+While waiting for approval, you can already generate your Huggingface access token (/howtos/huggingface_access_token/).
+
+Once you've been granted access, you're ready to download a gated model like Meta-Llama-3.1-8B-Instruct!

    # requires token in $HOME/.cache/huggingface/token, as created by the
    # `huggingface-cli login` command, or the `HUGGINGFACE_TOKEN` environment variable.
    cd examples
-   bazel run -c opt //llama:Meta-Llama-3-8b
-   bazel run -c opt //llama:Meta-Llama-3-8b -- --prompt="Once upon a time,"
-
-## Run Tests
-
-    bazel test //zml:test
+   bazel run -c opt //llama:Llama-3.1-8B-Instruct
+   bazel run -c opt //llama:Llama-3.1-8B-Instruct -- --prompt="What is the capital of France?"
+
+You can also try Llama-3.1-70B-Instruct if you have enough memory.
+
+### Meta Llama 3.2 1B
+
+Like the 8B model above, this model also requires approval; see https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct for access requirements.
+
+    cd examples
+    bazel run -c opt //llama:Llama-3.2-1B-Instruct
+    bazel run -c opt //llama:Llama-3.2-1B-Instruct -- --prompt="What is the capital of France?"
+
+For a larger 3.2 model, you can also try Llama-3.2-3B-Instruct.
+
+## Run Tests
+
+    bazel test //zml:test
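The gated Llama runs above assume a stored Huggingface token. A minimal setup sketch (added for this rewrite; the `pip install` line assumes a Python environment and is not part of the original page):

    # Option 1: log in once; the token is cached in $HOME/.cache/huggingface/token
    pip install -U huggingface_hub
    huggingface-cli login

    # Option 2: export the token for the current shell session only
    export HUGGINGFACE_TOKEN=hf_your_token_here   # placeholder, use your real token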
## Running Models on GPU / TPU

You can compile models for accelerator runtimes by appending one or more of the following arguments to the command line when compiling or running a model:

- NVIDIA CUDA: --@zml//runtimes:cuda=true
- AMD RoCM: --@zml//runtimes:rocm=true
- Google TPU: --@zml//runtimes:tpu=true
- AWS Trainium/Inferentia 2: --@zml//runtimes:neuron=true
- AVOID CPU: --@zml//runtimes:cpu=false

The latter, avoiding compilation for CPU, cuts down compilation time (see the combined sketch below).

So, to run one of the Llama models from above on your host sporting an NVIDIA GPU, run the following:

    cd examples
-   bazel run -c opt //llama:OpenLLaMA-3B \
-     --@zml//runtimes:cuda=true \
-     -- --prompt="Once upon a time,"
+   bazel run -c opt //llama:Llama-3.2-1B-Instruct \
+     --@zml//runtimes:cuda=true \
+     -- --prompt="What is the capital of France?"
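Since skipping CPU compilation cuts down compile time, the two runtime flags can be combined; here is a variant of the command above (an assumption that the flags compose as the list suggests, not a command taken from the page):

    cd examples
    # compile for CUDA only, skipping the CPU runtime entirely
    bazel run -c opt //llama:Llama-3.2-1B-Instruct \
      --@zml//runtimes:cuda=true \
      --@zml//runtimes:cpu=false \
      -- --prompt="What is the capital of France?"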
## Where to go next

In Deploying Models on a Server (/howtos/deploy_on_server/), we show how you can cross-compile and package for a specific architecture, then deploy and run your model. Alternatively, you can also dockerize your model (/howtos/dockerize_models/).

You might also want to check out the examples (https://github.com/zml/zml/tree/master/examples), read through the documentation (/), start writing your first model (/tutorials/write_first_model/), or read about more high-level ZML concepts (/learn/concepts/).
