This connector uses the `GithubRepositoryReader` from `llama_index`. The returned list of documents can then be uploaded to the platform.
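Under the hood this corresponds roughly to the following minimal sketch (assuming the `llama-index-readers-github` package; import paths may vary between `llama_index` versions, and the owner/repo values are placeholders):

```python
import os

from llama_index.readers.github import GithubClient, GithubRepositoryReader

client = GithubClient(github_token=os.environ["GITHUB_TOKEN"])
reader = GithubRepositoryReader(
    github_client=client,
    owner="some-owner",   # placeholder values
    repo="some-repo",
    filter_file_extensions=([".md"], GithubRepositoryReader.FilterType.INCLUDE),
)
documents = reader.load_data(branch="main")  # each matching file becomes a Document
```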
Create a YAML file under the `config` folder with the following parameters, for example `github_sandbox.yaml`; contact your provider for some of these values if needed:
github:
  api_token: !!str 'string' # or GITHUB_TOKEN environment variable
  base_url: !!str 'https://api.github.com'
  api_version: !!str 'string' # defaults to 2022-11-28
  verbose: !!bool true|false (default) # Whether to print verbose messages
  owner: !!str 'string' # Owner of the repository
  repo: !!str 'string' # Name of the repository
  use_parser: !!bool true|false (default) # Whether to use the parser to extract text from files
  filter_directories_filter: INCLUDE (default) | EXCLUDE
  filter_directories: # List of filters
    - !!str 'filter 1'
    - !!str 'filter 2'
  filter_file_extensions_filter: INCLUDE (default) | EXCLUDE
  filter_file_extensions: # List of extensions
    - !!str '.extension 1'
    - !!str '.extension 2'
  concurrent_requests: !!int integer
  branch: !!str 'main' # exclusive with commit_sha
  commit_sha: !!str ''
  namespace: !!str 'namespace name' # Must match the associated RAG assistant, check the index section
saia:
  base_url: !!str 'string' # Globant Enterprise AI Base URL
  api_token: !!str 'string'
  profile: !!str 'string' # Must match the RAG assistant ID
  max_parallel_executions: !!int 5
  upload_operation_log: !!bool False|True (default) # Check operations LOG for detail if enabled
# Deprecated
vectorstore:
  api_key: !!str 'check with the provider'
  index_name: !!str 'check with the provider'
embeddings:
  openapi_key: !!str 'check with the provider' # Or use your own
  chunk_size: !!int integer # DefaultVectorStore.CHUNK_SIZE by default
  chunk_overlap: !!int integer # DefaultVectorStore.CHUNK_OVERLAP by default
  model: !!str name # defaults to text-embedding-ada-002
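As a rough illustration of how the `github` section maps to the reader, the sketch below loads the file with PyYAML and builds the reader from it. This is a simplified example, not the connector's actual code; the file path and defaults are assumptions:

```python
import os

import yaml
from llama_index.readers.github import GithubClient, GithubRepositoryReader

with open("./config/github_sandbox.yaml") as f:
    cfg = yaml.safe_load(f)["github"]

def filter_type(name: str) -> GithubRepositoryReader.FilterType:
    # INCLUDE is the default; EXCLUDE inverts the filter.
    return (GithubRepositoryReader.FilterType.EXCLUDE
            if name == "EXCLUDE"
            else GithubRepositoryReader.FilterType.INCLUDE)

# Only pass the filters that are actually configured.
kwargs = {}
if cfg.get("filter_directories"):
    kwargs["filter_directories"] = (
        cfg["filter_directories"],
        filter_type(cfg.get("filter_directories_filter", "INCLUDE")),
    )
if cfg.get("filter_file_extensions"):
    kwargs["filter_file_extensions"] = (
        cfg["filter_file_extensions"],
        filter_type(cfg.get("filter_file_extensions_filter", "INCLUDE")),
    )

client = GithubClient(
    github_token=cfg.get("api_token") or os.environ["GITHUB_TOKEN"],
    verbose=cfg.get("verbose", False),
)
reader = GithubRepositoryReader(
    github_client=client,
    owner=cfg["owner"],
    repo=cfg["repo"],
    use_parser=cfg.get("use_parser", False),
    verbose=cfg.get("verbose", False),
    concurrent_requests=cfg.get("concurrent_requests", 5),
    **kwargs,
)

# branch and commit_sha are mutually exclusive: pass exactly one to load_data.
if cfg.get("commit_sha"):
    documents = reader.load_data(commit_sha=cfg["commit_sha"])
else:
    documents = reader.load_data(branch=cfg.get("branch", "main"))
```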
Example execution:
saia-cli ingest -c ./config/github_sandbox.yaml --type github
Expected output is similar to:
INFO:root:Successfully github ingestion 'timestamp' config: <path_to_config.yaml>
Use the `verbose` parameter to get details about the processing steps.
Tip: under the `debug` folder, the `{provider}_YYYYMMDDHHMMSS.json` file is the result of the ingestion and can be uploaded to any RAG assistant if you use the `.custom` extension when uploading the file.
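For example, the file can be copied with a `.custom` extension before uploading (the file name below is hypothetical; the timestamp depends on your run):

```python
import shutil

# Copy the debug output (hypothetical name) with the .custom extension
# so it can be uploaded to any RAG assistant.
shutil.copyfile(
    "./debug/github_20240401103000.json",
    "./debug/github_20240401103000.custom",
)
```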