Embedding Function Templates for unstructured text data

IBM watsonx.governance users need to pass a custom embedding function as an input while generating embeddings for a subscription via the notebook. This page provides templates of embedding functions that can be used for reference.

Input to embedding function

  • The input to the embedding function has to be a list of strings.

Output of embedding function

  • The output of the embedding function has to be a list of embedding vectors, where each vector is a list of floats.
  • The size of the output list needs to be the same as the size of the input list (see the sketch below).
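
To illustrate this contract, here is a minimal dummy embedding function; the fixed-size zero vectors and the dimension of 384 are placeholders for this sketch, not real embeddings:

from typing import List

def dummy_embeddings_fn(inputs: List[str]) -> List[List[float]]:
    # Return one fixed-size vector of floats per input string.
    # A real implementation would call an embedding model instead.
    return [[0.0] * 384 for _ in inputs]

# One embedding vector per input string, in the same order
vectors = dummy_embeddings_fn(["first text", "second text"])
assert len(vectors) == 2 and len(vectors[0]) == 384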

IBM watsonx.ai

  1. Make sure to install the ibm-watsonx-ai package.
  2. Add the API_KEY of your account.
  3. The embeddings functionality of watsonx.ai works within the scope of a project or a space. The example below asks for a PROJECT_ID.
  4. The example below uses sentence-transformers/all-minilm-l12-v2 to generate embeddings. Please check the list of supported embedding models in the watsonx.ai documentation. A list is also available via the watsonx.ai client: client.foundation_models.EmbeddingModels.show()
def embeddings_fn(inputs):
    from ibm_watsonx_ai import Credentials, APIClient
    from ibm_watsonx_ai.foundation_models import Embeddings
    from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames 
    
    # from time import time
    # start_time = time()

    API_KEY = "TO BE EDITED"
    WX_URL = "https://us-south.ml.cloud.ibm.com"
    PROJECT_ID = "TO BE EDITED"

    credentials = Credentials(
        url = WX_URL,
        api_key = API_KEY
    )

    client = APIClient(credentials, project_id=PROJECT_ID)
    # client.foundation_models.EmbeddingModels.show()
    # Initialize the embedding model; inputs longer than 128 tokens are truncated
    embedding = Embeddings(
        model_id=client.foundation_models.EmbeddingModels.ALL_MINILM_L12_V2,
        api_client=client,
        params={
            EmbedTextParamsMetaNames.TRUNCATE_INPUT_TOKENS: 128
        }
    )
    # embed_documents returns one embedding vector (a list of floats) per input string
    result = embedding.embed_documents(texts=inputs)
    # print(f"Got embeddings of {len(inputs)} inputs in {time() - start_time}s.")
    return result
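
As a quick sanity check, the function can be called directly on a small list of strings once the credentials and project ID are filled in. The sample texts below are illustrative only, and the dimension of 384 assumes the all-minilm-l12-v2 model:

# Illustrative usage: one embedding vector is returned per input string
sample_texts = ["The invoice was paid on time.", "The claim was rejected."]
vectors = embeddings_fn(sample_texts)
print(len(vectors), len(vectors[0]))  # 2 vectors, each with 384 floats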

Sentence Transformers Library

  1. The example below uses sentence-transformers/all-MiniLM-L12-v2 to generate embeddings. Please check the list of supported embedding models in the Sentence Transformers documentation.
from sentence_transformers import SentenceTransformer

# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L12-v2")

# 2. Calculate embeddings by calling model.encode()
embeddings_fn = model.encode
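
Note that model.encode returns NumPy arrays. If the platform requires plain Python lists of floats (this is an assumption about the expected output type, not part of the original template), a thin wrapper like the following sketch can convert the output:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L12-v2")

def embeddings_fn(inputs):
    # encode() returns a NumPy array with one row per input string;
    # tolist() converts it to a list of lists of floats
    return model.encode(inputs).tolist()

vectors = embeddings_fn(["first text", "second text"])
print(len(vectors), len(vectors[0]))  # 2 vectors, 384 floats each for all-MiniLM-L12-v2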