docs: Hookless-API #90

Merged
merged 11 commits into from
Feb 18, 2025
46 changes: 46 additions & 0 deletions README.md
@@ -15,6 +15,52 @@ To run any AI model in ExecuTorch, you need to export it to a `.pte` format. If
Take a look at how our library can help you build your React Native AI features in our docs:
https://docs.swmansion.com/react-native-executorch


# 🦙 **Quickstart - Running Llama**

**Get started with AI-powered text generation in 3 easy steps!**

### 1️⃣ **Installation**
```bash
# Install the package
yarn add react-native-executorch
cd ios && pod install && cd ..
```

---

### 2️⃣ **Setup & Initialization**
Add this to your component file:
```tsx
import {
  LLAMA3_2_1B_QLORA,
  LLAMA3_2_1B_TOKENIZER,
  useLLM,
} from 'react-native-executorch';

function MyComponent() {
  // Initialize the model 🚀
  const llama = useLLM({
    modelSource: LLAMA3_2_1B_QLORA,
    tokenizerSource: LLAMA3_2_1B_TOKENIZER,
  });
  // ... rest of your component
}
```

---

### 3️⃣ **Run the model!**
```tsx
const handleGenerate = async () => {
  const prompt = "The meaning of life is";

  // Generate text based on your desired prompt
  const response = await llama.generate(prompt);
  console.log("Llama says:", response);
};
```

## Minimal supported versions
The minimum supported versions are iOS 17.0 and Android 13.

7 changes: 7 additions & 0 deletions docs/docs/benchmarks/_category_.json
@@ -0,0 +1,7 @@
{
"label": "Benchmarks",
"position": 5,
"link": {
"type": "generated-index"
}
}
42 changes: 42 additions & 0 deletions docs/docs/benchmarks/inference-time.md
@@ -0,0 +1,42 @@
---
title: Inference Time
sidebar_position: 3
---

:::warning warning
The times in the tables were measured over consecutive runs of the model. The first run may take up to 2x longer because of model loading and initialization.
:::

## Classification

| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 |

## Object Detection

| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 |

## Style Transfer

| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
| ---------------------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 |
| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 |
| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 |
| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 |

## LLMs

| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] |
| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- |
| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 |
| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 |
| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 |
| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 |
| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 |
| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 |

❌ - Insufficient RAM.
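
To make the tokens/s figures easier to read, the sketch below turns them into a rough generation time for a given response length. It copies the iPhone 16 Pro column from the table above. The helper name `estimateGenerationSeconds` and the use of `null` for the ❌ cells are assumptions made for illustration and are not part of the library. The estimate ignores prompt processing and the slower first run mentioned in the warning.

```typescript
// Throughput in tokens/s for iPhone 16 Pro (XNNPACK), copied from the LLMs table.
// `null` stands for a ❌ cell (insufficient RAM). This encoding is an assumption.
const IPHONE_16_PRO_TOKENS_PER_SECOND: Record<string, number | null> = {
  LLAMA3_2_1B: 16.1,
  LLAMA3_2_1B_SPINQUANT: 40.6,
  LLAMA3_2_1B_QLORA: 31.8,
  LLAMA3_2_3B: null,
  LLAMA3_2_3B_SPINQUANT: 17.2,
  LLAMA3_2_3B_QLORA: 14.5,
};

// Hypothetical helper: seconds needed to emit `tokenCount` tokens at a steady rate.
// Returns null when the model is unknown or did not fit in RAM on this device.
function estimateGenerationSeconds(model: string, tokenCount: number): number | null {
  const rate = IPHONE_16_PRO_TOKENS_PER_SECOND[model];
  if (rate == null) return null; // covers both null and a missing key
  return tokenCount / rate;
}

// A 203-token answer from the SpinQuant 1B model takes roughly 5 seconds.
console.log(estimateGenerationSeconds("LLAMA3_2_1B_SPINQUANT", 203));
```

The same table read this way shows why the quantized variants matter on older devices: at 11.4 tokens/s the same answer would take close to 18 seconds.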
36 changes: 36 additions & 0 deletions docs/docs/benchmarks/memory-usage.md
@@ -0,0 +1,36 @@
---
title: Memory Usage
sidebar_position: 2
---

## Classification

| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] |
| ----------------- | ---------------------- | ------------------ |
| EFFICIENTNET_V2_S | 130 | 85 |

## Object Detection

| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
| ------------------------------ | ---------------------- | ------------------ |
| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 |

## Style Transfer

| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] |
| ---------------------------- | ---------------------- | ------------------ |
| STYLE_TRANSFER_CANDY | 950 | 350 |
| STYLE_TRANSFER_MOSAIC | 950 | 350 |
| STYLE_TRANSFER_UDNIE | 950 | 350 |
| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 |

## LLMs

| Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] |
| --------------------- | ---------------------- | ------------------ |
| LLAMA3_2_1B | 3.2 | 3.1 |
| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 |
| LLAMA3_2_1B_QLORA | 2.2 | 2.5 |
| LLAMA3_2_3B | 7.1 | 7.3 |
| LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 |
| LLAMA3_2_3B_QLORA | 4 | 4.1 |
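
One way to use these numbers is to shortlist the LLM variants that fit a given memory budget. The sketch below copies the Android column from the table above. `modelsWithinBudget` is a hypothetical helper and not a library function. How much memory an app may actually use depends on the device and the OS, which this page does not specify, so treat the budget as an input you have to measure yourself.

```typescript
// Memory usage in GB on Android (XNNPACK), copied from the LLMs table.
const ANDROID_LLM_MEMORY_GB: Record<string, number> = {
  LLAMA3_2_1B: 3.2,
  LLAMA3_2_1B_SPINQUANT: 1.9,
  LLAMA3_2_1B_QLORA: 2.2,
  LLAMA3_2_3B: 7.1,
  LLAMA3_2_3B_SPINQUANT: 3.7,
  LLAMA3_2_3B_QLORA: 4,
};

// Hypothetical helper: names of the models whose measured usage fits the budget,
// ordered from the smallest footprint to the largest.
function modelsWithinBudget(budgetGb: number): string[] {
  return Object.entries(ANDROID_LLM_MEMORY_GB)
    .filter(([, usageGb]) => usageGb <= budgetGb)
    .sort(([, a], [, b]) => a - b)
    .map(([name]) => name);
}

// With a 3 GB budget only the two quantized 1B variants remain.
console.log(modelsWithinBudget(3));
```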
36 changes: 36 additions & 0 deletions docs/docs/benchmarks/model-size.md
@@ -0,0 +1,36 @@
---
title: Model Size
sidebar_position: 1
---

## Classification

| Model | XNNPACK [MB] | Core ML [MB] |
| ----------------- | ------------ | ------------ |
| EFFICIENTNET_V2_S | 85.6 | 43.9 |

## Object Detection

| Model | XNNPACK [MB] |
| ------------------------------ | ------------ |
| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 |

## Style Transfer

| Model | XNNPACK [MB] | Core ML [MB] |
| ---------------------------- | ------------ | ------------ |
| STYLE_TRANSFER_CANDY | 6.78 | 5.22 |
| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 |
| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 |
| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 |

## LLMs

| Model | XNNPACK [GB] |
| --------------------- | ------------ |
| LLAMA3_2_1B | 2.47 |
| LLAMA3_2_1B_SPINQUANT | 1.14 |
| LLAMA3_2_1B_QLORA | 1.18 |
| LLAMA3_2_3B | 6.43 |
| LLAMA3_2_3B_SPINQUANT | 2.55 |
| LLAMA3_2_3B_QLORA | 2.65 |
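
The effect of quantization on download size can be read off the table with a little arithmetic. The sketch below copies the XNNPACK sizes from the LLMs table and computes the reduction relative to the unquantized model. `sizeReductionPercent` is a hypothetical helper used only for illustration.

```typescript
// Model sizes in GB (XNNPACK), copied from the LLMs table.
const LLM_SIZE_GB: Record<string, number> = {
  LLAMA3_2_1B: 2.47,
  LLAMA3_2_1B_SPINQUANT: 1.14,
  LLAMA3_2_1B_QLORA: 1.18,
  LLAMA3_2_3B: 6.43,
  LLAMA3_2_3B_SPINQUANT: 2.55,
  LLAMA3_2_3B_QLORA: 2.65,
};

// Hypothetical helper: how much smaller `variant` is than `baseline`,
// as a percentage rounded to one decimal place.
function sizeReductionPercent(baseline: string, variant: string): number {
  const baseGb = LLM_SIZE_GB[baseline];
  const variantGb = LLM_SIZE_GB[variant];
  if (baseGb === undefined || variantGb === undefined) {
    throw new Error("Unknown model name");
  }
  return Math.round((1 - variantGb / baseGb) * 1000) / 10;
}

console.log(sizeReductionPercent("LLAMA3_2_1B", "LLAMA3_2_1B_SPINQUANT")); // 53.8
console.log(sizeReductionPercent("LLAMA3_2_3B", "LLAMA3_2_3B_SPINQUANT")); // 60.3
```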
@@ -38,12 +38,13 @@ A string that specifies the location of the model binary. For more information,

### Returns

| Field | Type | Description |
| ------------------ | ------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------- |
| `forward` | `(input: string) => Promise<{ [category: string]: number }>` | Executes the model's forward pass, where `input` can be a fetchable resource or a Base64-encoded string. |
| `error` | <code>string &#124; null</code> | Contains the error message if the model failed to load. |
| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. |
| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. |
| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. |
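
Since `forward` resolves to a map from category name to a number, a common next step is to pick the highest-scoring entries. The following is a minimal sketch under two assumptions: that a higher number means a more likely category, and that the result has the shape given in the table above. `topCategories` is a hypothetical helper and the scores below are made up.

```typescript
// Shape of the value that `forward` resolves to, as listed in the Returns table.
type ClassificationResult = { [category: string]: number };

// Hypothetical helper: the `k` highest-scoring categories, best first.
function topCategories(scores: ClassificationResult, k: number): Array<[string, number]> {
  return Object.entries(scores)
    .sort(([, a], [, b]) => b - a)
    .slice(0, k);
}

// Made-up scores for illustration; real values come from `forward`.
const exampleScores: ClassificationResult = { tabby: 0.71, tiger_cat: 0.2, lynx: 0.05 };
console.log(topCategories(exampleScores, 2));
```

Sorting the full map is fine for a 1000-class model; a partial selection would only matter for much larger label sets.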

## Running the model

@@ -86,3 +87,27 @@ function App() {
| Model | Number of classes | Class list |
| --------------------------------------------------------------------------------------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [efficientnet_v2_s](https://pytorch.org/vision/0.20/models/generated/torchvision.models.efficientnet_v2_s.html) | 1000 | [ImageNet1k_v1](https://github.com/software-mansion/react-native-executorch/blob/main/android/src/main/java/com/swmansion/rnexecutorch/models/classification/Constants.kt) |

## Benchmarks

### Model size

| Model | XNNPACK [MB] | Core ML [MB] |
| ----------------- | ------------ | ------------ |
| EFFICIENTNET_V2_S | 85.6 | 43.9 |

### Memory usage

| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] |
| ----------------- | ---------------------- | ------------------ |
| EFFICIENTNET_V2_S | 130 | 85 |

### Inference time

:::warning warning
The times in the tables were measured over consecutive runs of the model. The first run may take up to 2x longer because of model loading and initialization.
:::

| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 |
@@ -61,12 +61,13 @@ For more information on that topic, you can check out the [Loading models](https

The hook returns an object with the following properties:

| Field | Type | Description |
| ------------------ | ----------------------------------------- | ---------------------------------------------------------------------------------------- |
| `forward` | `(input: string) => Promise<Detection[]>` | A function that accepts an image (a URL or a Base64-encoded string) and returns an array of `Detection` objects. |
| `error` | <code>string &#124; null</code> | Contains the error message if the model loading failed. |
| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. |
| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. |
| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. |
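
The status fields in the table above can be combined into a single message for the UI. The sketch below is one possible ordering of the checks and not the library's own logic. The `ModelState` interface simply mirrors the non-function fields listed in the table, and `describeModelState` is a hypothetical helper.

```typescript
// Mirrors the status fields returned by the hook, as listed in the table above.
interface ModelState {
  error: string | null;
  isReady: boolean;
  isGenerating: boolean;
  downloadProgress: number; // value between 0 and 1
}

// Hypothetical helper: a short, human-readable status line.
function describeModelState(state: ModelState): string {
  if (state.error !== null) return `Error: ${state.error}`;
  if (!state.isReady) {
    // Clamp defensively so an out-of-range value never renders as e.g. 120%.
    const clamped = Math.min(1, Math.max(0, state.downloadProgress));
    return `Loading model (download ${Math.round(clamped * 100)}%)`;
  }
  return state.isGenerating ? "Running inference..." : "Ready";
}

console.log(
  describeModelState({ error: null, isReady: false, isGenerating: false, downloadProgress: 0.42 })
);
```

Checking `error` first means a failed load is never shown as a stalled download.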

## Running the model

@@ -124,3 +125,27 @@ function App() {
| Model | Number of classes | Class list |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
| [SSDLite320 MobileNetV3 Large](https://pytorch.org/vision/main/models/generated/torchvision.models.detection.ssdlite320_mobilenet_v3_large.html#torchvision.models.detection.SSDLite320_MobileNet_V3_Large_Weights) | 91 | [COCO](https://github.com/software-mansion/react-native-executorch/blob/69802ee1ca161d9df00def1dabe014d36341cfa9/src/types/object_detection.ts#L14) |

## Benchmarks

### Model size

| Model | XNNPACK [MB] |
| ------------------------------ | ------------ |
| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 |

### Memory usage

| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
| ------------------------------ | ---------------------- | ------------------ |
| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 |

### Inference time

:::warning warning
The times in the tables were measured over consecutive runs of the model. The first run may take up to 2x longer because of model loading and initialization.
:::

| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- |
| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 |