software-mansion · jakmro · Feb 18, 2025 · Jan 21, 2025
diff --git a/README.md b/README.md
@@ -15,6 +15,52 @@ To run any AI model in ExecuTorch, you need to export it to a `.pte` format. If
 Take a look at how our library can help build you your React Native AI features in our docs:  
 https://docs.swmansion.com/react-native-executorch
 
+
+# 🦙 **Quickstart - Running Llama**  
+
+**Get started with AI-powered text generation in 3 easy steps!**  
+
+### 1️⃣ **Installation**  
+```bash
+# Install the package
+yarn add react-native-executorch
+cd ios && pod install && cd ..
+```
+
+---
+
+### 2️⃣ **Setup & Initialization**  
+Add this to your component file:  
+```tsx
+import { 
+  LLAMA3_2_1B_QLORA, 
+  LLAMA3_2_1B_TOKENIZER,
+  useLLM 
+} from 'react-native-executorch';
+
+function MyComponent() {
+  // Initialize the model 🚀
+  const llama = useLLM({
+    modelSource: LLAMA3_2_1B_QLORA,
+    tokenizerSource: LLAMA3_2_1B_TOKENIZER
+  });
+  // ... rest of your component
+}
+```
+
+---
+
+### 3️⃣ **Run the model!**  
+```tsx
+const handleGenerate = async () => {
+  const prompt = "The meaning of life is";
+
+  // Generate text based on your desired prompt
+  const response = await llama.generate(prompt);
+  console.log("Llama says:", response);
+};
+```
+
 ## Minimal supported versions
 The minimal supported version is 17.0 for iOS and Android 13.
 

diff --git a/docs/docs/benchmarks/_category_.json b/docs/docs/benchmarks/_category_.json
@@ -0,0 +1,7 @@
+{
+  "label": "Benchmarks",
+  "position": 5,
+  "link": {
+    "type": "generated-index"
+  }
+}
diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md
@@ -0,0 +1,42 @@
+---
+title: Inference Time
+sidebar_position: 3
+---
+
+:::warning warning
+The times in the tables were measured over consecutive runs of the model. The first run may take up to 2x longer because of model loading and initialization.
+:::
+
+## Classification
+
+| Model             | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
+| EFFICIENTNET_V2_S | 100                          | 120                          | 130                        | 180                               | 170                       |
+
+## Object Detection
+
+| Model                          | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
+| SSDLITE_320_MOBILENET_V3_LARGE | 190                          | 260                          | 280                        | 100                               | 90                        |
+
+## Style Transfer
+
+| Model                        | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ---------------------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
+| STYLE_TRANSFER_CANDY         | 450                          | 600                          | 750                        | 1650                              | 1800                      |
+| STYLE_TRANSFER_MOSAIC        | 450                          | 600                          | 750                        | 1650                              | 1800                      |
+| STYLE_TRANSFER_UDNIE         | 450                          | 600                          | 750                        | 1650                              | 1800                      |
+| STYLE_TRANSFER_RAIN_PRINCESS | 450                          | 600                          | 750                        | 1650                              | 1800                      |
+
+## LLMs
+
+| Model                 | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] |
+| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- |
+| LLAMA3_2_1B           | 16.1                               | 11.4                               | ❌                               | 15.6                                    | 19.3                            |
+| LLAMA3_2_1B_SPINQUANT | 40.6                               | 16.7                               | 16.5                             | 40.3                                    | 48.2                            |
+| LLAMA3_2_1B_QLORA     | 31.8                               | 11.4                               | 11.2                             | 37.3                                    | 44.4                            |
+| LLAMA3_2_3B           | ❌                                 | ❌                                 | ❌                               | ❌                                      | 7.1                             |
+| LLAMA3_2_3B_SPINQUANT | 17.2                               | 8.2                                | ❌                               | 16.2                                    | 19.4                            |
+| LLAMA3_2_3B_QLORA     | 14.5                               | ❌                                 | ❌                               | 14.8                                    | 18.1                            |
+
+❌ - Insufficient RAM.
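To put the tokens/s figures from the LLMs table above in context, throughput can be converted into a rough wall-clock estimate for a response of a given length. The sketch below is plain arithmetic under stated assumptions: `estimateGenerationSeconds` is a hypothetical helper, not part of react-native-executorch, and it assumes a constant decoding rate and ignores model loading and prompt processing.

```typescript
// Hypothetical helper, not part of react-native-executorch.
// Estimates the seconds needed to generate `tokenCount` tokens
// at a steady throughput of `tokensPerSecond`.
function estimateGenerationSeconds(
  tokenCount: number,
  tokensPerSecond: number,
): number {
  if (tokensPerSecond <= 0) {
    throw new RangeError('tokensPerSecond must be positive');
  }
  return tokenCount / tokensPerSecond;
}

// LLAMA3_2_1B_QLORA on iPhone 16 Pro: 31.8 tokens/s (from the LLMs table).
const qloraSeconds = estimateGenerationSeconds(256, 31.8);
// Same model on iPhone 13 Pro: 11.4 tokens/s.
const olderDeviceSeconds = estimateGenerationSeconds(256, 11.4);

console.log(qloraSeconds.toFixed(1), olderDeviceSeconds.toFixed(1));
```

Under these assumptions a 256-token reply takes roughly 8 seconds on the newer device and roughly 22 seconds on the older one, which is one way to judge whether a given model is responsive enough for a chat-style interface.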
diff --git a/docs/docs/benchmarks/memory-usage.md b/docs/docs/benchmarks/memory-usage.md
@@ -0,0 +1,36 @@
+---
+title: Memory Usage
+sidebar_position: 2
+---
+
+## Classification
+
+| Model             | Android (XNNPACK) [MB] | iOS (Core ML) [MB] |
+| ----------------- | ---------------------- | ------------------ |
+| EFFICIENTNET_V2_S | 130                    | 85                 |
+
+## Object Detection
+
+| Model                          | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| ------------------------------ | ---------------------- | ------------------ |
+| SSDLITE_320_MOBILENET_V3_LARGE | 90                     | 90                 |
+
+## Style Transfer
+
+| Model                        | Android (XNNPACK) [MB] | iOS (Core ML) [MB] |
+| ---------------------------- | ---------------------- | ------------------ |
+| STYLE_TRANSFER_CANDY         | 950                    | 350                |
+| STYLE_TRANSFER_MOSAIC        | 950                    | 350                |
+| STYLE_TRANSFER_UDNIE         | 950                    | 350                |
+| STYLE_TRANSFER_RAIN_PRINCESS | 950                    | 350                |
+
+## LLMs
+
+| Model                 | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] |
+| --------------------- | ---------------------- | ------------------ |
+| LLAMA3_2_1B           | 3.2                    | 3.1                |
+| LLAMA3_2_1B_SPINQUANT | 1.9                    | 2                  |
+| LLAMA3_2_1B_QLORA     | 2.2                    | 2.5                |
+| LLAMA3_2_3B           | 7.1                    | 7.3                |
+| LLAMA3_2_3B_SPINQUANT | 3.7                    | 3.8                |
+| LLAMA3_2_3B_QLORA     | 4                      | 4.1                |
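One way to use the LLM memory table above is to shortlist models that fit a memory budget. The following is a minimal sketch: the numbers are copied from the Android (XNNPACK) column, while `modelsWithinBudget` and the 4 GB budget are illustrative assumptions. The budget stands for memory actually available to the app, which is typically less than the device's total RAM.

```typescript
// Memory usage in GB on Android (XNNPACK), copied from the LLMs table above.
const androidLlmMemoryGb: Record<string, number> = {
  LLAMA3_2_1B: 3.2,
  LLAMA3_2_1B_SPINQUANT: 1.9,
  LLAMA3_2_1B_QLORA: 2.2,
  LLAMA3_2_3B: 7.1,
  LLAMA3_2_3B_SPINQUANT: 3.7,
  LLAMA3_2_3B_QLORA: 4,
};

// Hypothetical helper: names of the models whose measured usage
// does not exceed `budgetGb`, in table order.
function modelsWithinBudget(
  memoryGb: Record<string, number>,
  budgetGb: number,
): string[] {
  return Object.entries(memoryGb)
    .filter(([, usageGb]) => usageGb <= budgetGb)
    .map(([name]) => name);
}

// With a 4 GB budget only the unquantized 3B model is excluded.
console.log(modelsWithinBudget(androidLlmMemoryGb, 4));
```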
diff --git a/docs/docs/benchmarks/model-size.md b/docs/docs/benchmarks/model-size.md
@@ -0,0 +1,36 @@
+---
+title: Model Size
+sidebar_position: 1
+---
+
+## Classification
+
+| Model             | XNNPACK [MB] | Core ML [MB] |
+| ----------------- | ------------ | ------------ |
+| EFFICIENTNET_V2_S | 85.6         | 43.9         |
+
+## Object Detection
+
+| Model                          | XNNPACK [MB] |
+| ------------------------------ | ------------ |
+| SSDLITE_320_MOBILENET_V3_LARGE | 13.9         |
+
+## Style Transfer
+
+| Model                        | XNNPACK [MB] | Core ML [MB] |
+| ---------------------------- | ------------ | ------------ |
+| STYLE_TRANSFER_CANDY         | 6.78         | 5.22         |
+| STYLE_TRANSFER_MOSAIC        | 6.78         | 5.22         |
+| STYLE_TRANSFER_UDNIE         | 6.78         | 5.22         |
+| STYLE_TRANSFER_RAIN_PRINCESS | 6.78         | 5.22         |
+
+## LLMs
+
+| Model                 | XNNPACK [GB] |
+| --------------------- | ------------ |
+| LLAMA3_2_1B           | 2.47         |
+| LLAMA3_2_1B_SPINQUANT | 1.14         |
+| LLAMA3_2_1B_QLORA     | 1.18         |
+| LLAMA3_2_3B           | 6.43         |
+| LLAMA3_2_3B_SPINQUANT | 2.55         |
+| LLAMA3_2_3B_QLORA     | 2.65         |
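The effect of quantization on download size can be read off the LLM table above with a little arithmetic. The sketch below uses the table's values; `sizeReductionPercent` is a hypothetical helper introduced only for illustration.

```typescript
// Hypothetical helper: percentage by which `quantizedGb` is smaller than `baseGb`.
function sizeReductionPercent(baseGb: number, quantizedGb: number): number {
  if (baseGb <= 0) {
    throw new RangeError('baseGb must be positive');
  }
  return (1 - quantizedGb / baseGb) * 100;
}

// LLAMA3_2_1B (2.47 GB) vs LLAMA3_2_1B_SPINQUANT (1.14 GB).
const reduction1b = sizeReductionPercent(2.47, 1.14);
// LLAMA3_2_3B (6.43 GB) vs LLAMA3_2_3B_SPINQUANT (2.55 GB).
const reduction3b = sizeReductionPercent(6.43, 2.55);

console.log(reduction1b.toFixed(1), reduction3b.toFixed(1));
```

By this measure the SpinQuant variants are a bit more than half the size of the original 1B model and about 60% smaller than the original 3B model.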
diff --git a/...ocs/computer-vision/useClassification.mdx → ...docs/computer-vision/useClassification.md b/...ocs/computer-vision/useClassification.mdx → ...docs/computer-vision/useClassification.md
@@ -38,12 +38,13 @@ A string that specifies the location of the model binary. For more information,
 
 ### Returns
 
-| Field          | Type                                                         | Description                                                                                              |
-| -------------- | ------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------- |
-| `forward`      | `(input: string) => Promise<{ [category: string]: number }>` | Executes the model's forward pass, where `input` can be a fetchable resource or a Base64-encoded string. |
-| `error`        | <code>string &#124; null</code>                              | Contains the error message if the model failed to load.                                                  |
-| `isGenerating` | `boolean`                                                    | Indicates whether the model is currently processing an inference.                                        |
-| `isReady`      | `boolean`                                                    | Indicates whether the model has successfully loaded and is ready for inference.                          |
+| Field              | Type                                                         | Description                                                                                              |
+| ------------------ | ------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------- |
+| `forward`          | `(input: string) => Promise<{ [category: string]: number }>` | Executes the model's forward pass, where `input` can be a fetchable resource or a Base64-encoded string. |
+| `error`            | <code>string &#124; null</code>                              | Contains the error message if the model failed to load.                                                  |
+| `isGenerating`     | `boolean`                                                    | Indicates whether the model is currently processing an inference.                                        |
+| `isReady`          | `boolean`                                                    | Indicates whether the model has successfully loaded and is ready for inference.                          |
+| `downloadProgress` | `number`                                                     | Represents the download progress as a value between 0 and 1.                                             |
 
 ## Running the model
 
@@ -86,3 +87,27 @@ function App() {
 | Model                                                                                                           | Number of classes | Class list                                                                                                                                                                 |
 | --------------------------------------------------------------------------------------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | [efficientnet_v2_s](https://pytorch.org/vision/0.20/models/generated/torchvision.models.efficientnet_v2_s.html) | 1000              | [ImageNet1k_v1](https://github.com/software-mansion/react-native-executorch/blob/main/android/src/main/java/com/swmansion/rnexecutorch/models/classification/Constants.kt) |
+
+## Benchmarks
+
+### Model size
+
+| Model             | XNNPACK [MB] | Core ML [MB] |
+| ----------------- | ------------ | ------------ |
+| EFFICIENTNET_V2_S | 85.6         | 43.9         |
+
+### Memory usage
+
+| Model             | Android (XNNPACK) [MB] | iOS (Core ML) [MB] |
+| ----------------- | ---------------------- | ------------------ |
+| EFFICIENTNET_V2_S | 130                    | 85                 |
+
+### Inference time
+
+:::warning warning
+The times in the tables were measured over consecutive runs of the model. The first run may take up to 2x longer because of model loading and initialization.
+:::
+
+| Model             | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
+| EFFICIENTNET_V2_S | 100                          | 120                          | 130                        | 180                               | 170                       |
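The Returns table in this file types the result of `forward` as `{ [category: string]: number }`, which for a 1000-class model is usually reduced to a few top entries before display. A minimal sketch of that step follows; `ClassScores` and `topK` are hypothetical names, and the sample scores are made up for illustration.

```typescript
// Shape of the value `forward` resolves to, per the Returns table.
type ClassScores = { [category: string]: number };

// Hypothetical helper: the `k` highest-scoring categories, best first.
function topK(scores: ClassScores, k: number): Array<[string, number]> {
  return Object.entries(scores)
    .sort(([, a], [, b]) => b - a) // descending by score
    .slice(0, Math.max(0, k));
}

// Made-up scores; real output would contain one entry per class.
const sampleScores: ClassScores = {
  tabby: 0.71,
  tiger_cat: 0.18,
  lynx: 0.06,
  sock: 0.01,
};

console.log(topK(sampleScores, 2));
```

In a component, the object passed to `topK` would be the awaited result of `forward(imageUri)` rather than a literal.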
diff --git a/...cs/computer-vision/useObjectDetection.mdx → ...ocs/computer-vision/useObjectDetection.md b/...cs/computer-vision/useObjectDetection.mdx → ...ocs/computer-vision/useObjectDetection.md
@@ -61,12 +61,13 @@ For more information on that topic, you can check out the [Loading models](https
 
 The hook returns an object with the following properties:
 
-| Field          | Type                                      | Description                                                                              |
-| -------------- | ----------------------------------------- | ---------------------------------------------------------------------------------------- |
-| `forward`      | `(input: string) => Promise<Detection[]>` | A function that accepts an image (url, b64) and returns an array of `Detection` objects. |
-| `error`        | <code>string &#124; null</code>           | Contains the error message if the model loading failed.                                  |
-| `isGenerating` | `boolean`                                 | Indicates whether the model is currently processing an inference.                        |
-| `isReady`      | `boolean`                                 | Indicates whether the model has successfully loaded and is ready for inference.          |
+| Field              | Type                                      | Description                                                                              |
+| ------------------ | ----------------------------------------- | ---------------------------------------------------------------------------------------- |
+| `forward`          | `(input: string) => Promise<Detection[]>` | A function that accepts an image (url, b64) and returns an array of `Detection` objects. |
+| `error`            | <code>string &#124; null</code>           | Contains the error message if the model loading failed.                                  |
+| `isGenerating`     | `boolean`                                 | Indicates whether the model is currently processing an inference.                        |
+| `isReady`          | `boolean`                                 | Indicates whether the model has successfully loaded and is ready for inference.          |
+| `downloadProgress` | `number`                                  | Represents the download progress as a value between 0 and 1.                             |
 
 ## Running the model
 
@@ -124,3 +125,27 @@ function App() {
 | Model                                                                                                                                                                                                               | Number of classes | Class list                                                                                                                                          |
 | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
 | [SSDLite320 MobileNetV3 Large](https://pytorch.org/vision/main/models/generated/torchvision.models.detection.ssdlite320_mobilenet_v3_large.html#torchvision.models.detection.SSDLite320_MobileNet_V3_Large_Weights) | 91                | [COCO](https://github.com/software-mansion/react-native-executorch/blob/69802ee1ca161d9df00def1dabe014d36341cfa9/src/types/object_detection.ts#L14) |
+
+## Benchmarks
+
+### Model size
+
+| Model                          | XNNPACK [MB] |
+| ------------------------------ | ------------ |
+| SSDLITE_320_MOBILENET_V3_LARGE | 13.9         |
+
+### Memory usage
+
+| Model                          | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| ------------------------------ | ---------------------- | ------------------ |
+| SSDLITE_320_MOBILENET_V3_LARGE | 90                     | 90                 |
+
+### Inference time
+
+:::warning warning
+The times in the tables were measured over consecutive runs of the model. The first run may take up to 2x longer because of model loading and initialization.
+:::
+
+| Model                          | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
+| SSDLITE_320_MOBILENET_V3_LARGE | 190                          | 260                          | 280                        | 100                               | 90                        |
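The `downloadProgress` field added to the Returns tables in this change is documented as a number between 0 and 1, so a UI typically needs to convert it to a percentage label. A small sketch follows, assuming nothing beyond that documented range; `formatDownloadProgress` is a hypothetical helper, not a library export.

```typescript
// Hypothetical helper: turns a 0..1 progress value into a label such as "42%".
function formatDownloadProgress(progress: number): string {
  // Clamp defensively in case a value slightly outside [0, 1] is reported.
  const clamped = Math.min(1, Math.max(0, progress));
  return `${Math.round(clamped * 100)}%`;
}

console.log(formatDownloadProgress(0.5));
```

The returned label could be rendered while `isReady` is still `false`, for example inside a `<Text>` element next to a spinner.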