# Config files

A main aim of the sunholo library is to make as much of the functionality needed for GenAI apps available via configuration files, rather than within the code.

This allows you to set up new instances of GenAI apps quickly, and to experiment with new models, vectorstores and other features.

Various config files control different features, such as VAC behaviour and user access. This is still very much a work in progress, so the format may change in the future.

## Calling config files

Use the config functions within `sunholo.utils` to load config files in your GenAI application. The most frequently used config is `vacConfig`, which is called like this:

```python
from sunholo.utils import ConfigManager

pirate_config = ConfigManager('pirate_speak')
llm = pirate_config.vacConfig('llm')
# 'openai'
agent = pirate_config.vacConfig('agent')
# 'langserve'

vector_name = 'eduvac'
eduvac_config = ConfigManager(vector_name)
llm = eduvac_config.vacConfig('llm')
# 'anthropic'
agent = eduvac_config.vacConfig('agent')
# 'eduvac'
```

You can name your config files anything, as long as they are in the `config/` folder relative to your working directory, or in the folder configured via the `VAC_CONFIG_FOLDER` environment variable.

Config files in your local `config/` folder will merge with the files within `VAC_CONFIG_FOLDER`, which you can use to set up dev and test environments and to keep config files closer to your VAC code.
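For example, a shared folder can hold the base configs for dev and test environments while the local `config/` folder carries the files for the VAC you are developing. The paths and filenames below are illustrative:

```bash
# shared base configs, merged in from the folder VAC_CONFIG_FOLDER points at
export VAC_CONFIG_FOLDER=/workspace/shared-configs
ls $VAC_CONFIG_FOLDER
# vac_config.yaml  prompt_config.yaml  users_config.yaml

# local configs next to your VAC code are merged with the shared folder above
ls config/
# vac_config.yaml
```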

## sunholo CLI

The `sunholo` CLI includes a command to more easily inspect and validate configurations:

```bash
sunholo list-configs
#'## Config kind: promptConfig'
#{'apiVersion': 'v1',
# 'kind': 'promptConfig',
# 'prompts': {'eduvac': {'chat_summary': 'Summarise the conversation below:\n'
#                                        '# Chat History\n'
#                                        '{chat_history}\n'
#                                        '# End Chat History\n'
#                                        'If in the chat history is a lesson '
# ...

sunholo list-configs --kind 'vacConfig'
## Config kind: vacConfig
#{'apiVersion': 'v1',
# 'kind': 'vacConfig',
# 'vac': {'codey': {'agent': 'edmonbrain_rag',
# ...

sunholo list-configs --kind=vacConfig --vac=edmonbrain
## Config kind: vacConfig
#{'edmonbrain': {'agent': 'edmonbrain',
#                'avatar_url': 'https://avatars.githubusercontent.com/u/3155884?s=48&v=4',
#                'description': 'This is the original '
#                               '[Edmonbrain](https://code.markedmondson.me/running-llms-on-gcp/) '
#                               'implementation that uses RAG to answer '
#                               'questions based on data you send in via its '
# ...

# add the --validate flag to check the configuration against a schema
sunholo list-configs --kind=vacConfig --vac=edmonbrain --validate
## Config kind: vacConfig
#{'edmonbrain': {'agent': 'edmonbrain',
#                'avatar_url': 'https://avatars.githubusercontent.com/u/3155884?s=48&v=4',
#                'description': 'This is the original '
#                               '[Edmonbrain](https://code.markedmondson.me/running-llms-on-gcp/) '
#                               'implementation that uses RAG to answer '
#                               'questions based on data you send in via its '
# ...
#Validating configuration for kind: vacConfig
#Validating vacConfig for edmonbrain
#OK: Validated schema
```

You can use the `--validate` flag in CI/CD to check the configuration on each commit, for example in Cloud Build:

```yaml
...
  - name: 'python:3.9'
    id: validate config
    entrypoint: 'bash'
    waitFor: ["-"]
    args:
      - '-c'
      - |
        pip install --no-cache sunholo
        sunholo list-configs --validate || exit 1
```

## vacConfig

This is the main day-to-day configuration file, used to set LLMs, databases and VAC tags. An example is shown here:

```yaml
kind: vacConfig
apiVersion: v1
gcp_config: # reached via vac='global'
  project_id: default-gcp-project
  location: europe-west1
  endpoints_base_url: https://endpoints-xxxxx.a.run.app # if using Cloud Endpoints
vac:
  personal_llama:
    llm: vertex # using google vertex
    model: gemini-1.5-pro-preview-0514 # models within google vertex
    agent: vertex-genai # using VAC created for Vertex
    display_name: Lots of Vertex AI features # for UI to the end user
    code_execution: true # to add code execution abilities
    grounding: # vertex only - add grounding
      google_search: true
    memory: # multiple memory allowed
      - discovery_engine_vertex_ai_search:
          vectorstore: vertex_ai_search # or 'discovery_engine'
      - llamaindex-native:
          vectorstore: llamaindex # only on vertex
          rag_id: 4611686018427387904 # generated via vertex RAG
      - agent_data_store:
          vectorstore: vertex_ai_search # only on vertex
    gcp_config:
      project_id: multivac-internal-dev # default project
      location: us-central1 # default location
    chunker: # control chunking behaviour when sending data to llamaindex
      chunk_size: 1000
      overlap: 200
  pirate_speak:
    llm: openai
    agent: langserve
    #agent_url: you can specify manually your URL endpoint here, or on Multivac it will be populated automatically
    display_name: Pirate Speak
    tags: ["free"] # for user access, matches users_config.yaml
    avatar_url: https://avatars.githubusercontent.com/u/126733545?s=48&v=4
    description: A Langserve demo using a demo [Langchain Template](https://templates.langchain.com/) that will repeat back what you say but in a pirate accent. Ooh argh me hearties! Langchain templates cover many different GenAI use cases and all can be streamed to Multivac clients.
  eduvac:
    llm: anthropic
    model: claude-3-opus-20240229
    agent: eduvac # needs to match multivac service name
    agent_type: langserve # if you are using a langserve instance for each VAC, you can specify that it is derived from langserve
    display_name: Edu-VAC
    tags: ["free"] # set to "eduvac" if you want to restrict usage to only users tagged "eduvac" in users_config.yaml
    avatar_url: ../public/eduvac.png
    description: Educate yourself in your own personal documents via guided learning from Eduvac, the ever patient teacher bot. Use search filters to examine available syllabus or upload your own documents to get started.
    upload: # to accept uploads of private documents to a bucket
      mime_types: # pick which mime types go to which bucket
        - all
      buckets:
        all: your-bucket
    buckets: # pick which bucket takes default uploads
      raw: your-bucket
    docstore: # this needs to be valid to have document storage
      - alloydb-docstore: # you can have multiple doc stores
          type: alloydb
    alloydb_config: # example if using alloydb as your doc or vectorstore
      project_id: your-projectid
      region: europe-west1
      cluster: your-cluster
      instance: primary-instance-1
  csv_agent:
    llm: openai
    agent: langserve
    #agent_url: you can specify manually your URL endpoint here, or on Multivac it will be populated automatically
    display_name: Titanic
    tags: ["free"]
    avatar_url: https://avatars.githubusercontent.com/u/126733545?s=48&v=4
    description: A Langserve demo using a demo [Langchain Template](https://templates.langchain.com/) that lets you ask questions over structured data like a database. In this case, a local database contains statistics from the Titanic disaster passengers. Langchain templates cover many different GenAI use cases and all can be streamed to Multivac clients.
  rag_lance:
    llm: openai
    agent: langserve
    display_name: Simple RAG
    tags: ["free"]
    avatar_url: https://avatars.githubusercontent.com/u/126733545?s=48&v=4
    description: A Langserve demo using a demo [Langchain Template](https://templates.langchain.com/) that lets you ask questions over unstructured data.
    memory: # you can have multiple destinations for your embedding pipelines
      - lancedb-vectorstore:
          vectorstore: lancedb
          read_only: true # don't write embeddings to this vectorstore
  finetuned_model:
    llm: model_garden # an example of a custom model such as Llama3 served by Vertex Model Garden
    agent: langserve
    tags: ["clientA"]
    gcp_config: # details of the Model Garden endpoint
      project_id: model_garden_project
      endpoint_id: 12345678
      location: europe-west1
  image_talk:
    llm: vertex
    model: gemini-1.0-pro-vision
    agent: langserve
    upload: # example of accepting uploads
      mime_types:
        - image
    display_name: Talk to Images
    tags: ["free"]
    avatar_url: https://avatars.githubusercontent.com/u/1342004?s=200&v=4
    description: A picture is worth a thousand words, so upload your picture and ask your question to the Gemini Pro Vision model. Images are remembered for your conversation until you upload another. This offers powerful applications, which you can get a feel for via the [Gemini Pro Vision docs](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/design-multimodal-prompts)
  sample_vector:
    llm: azure # using Azure OpenAI endpoints
    model: gpt-4-turbo-1106-preview
    agent: langserve
    display_name: Sample vector for tests
    avatar_url: https://avatars.githubusercontent.com/u/126733545?s=48&v=4
    description: An Azure OpenAI example
    memory: # you can have multiple vectorstore destinations
      - lancedb-vectorstore:
          vectorstore: lancedb
    embedder:
      llm: azure
    azure: # your azure details
      azure_openai_endpoint: https://openai-central-blah.openai.azure.com/
      openai_api_version: 2024-02-01
      embed_model: text-embedding-ada-002 # or text-embedding-3-large
  edmonbrain:
    llm: openai
    agent: edmonbrain
    display_name: Edmonbrain
    avatar_url: https://avatars.githubusercontent.com/u/3155884?s=48&v=4
    description: This is the original [Edmonbrain](https://code.markedmondson.me/running-llms-on-gcp/) implementation that uses RAG to answer questions based on data you send in via its `!help` commands and learns from previous chat history. It also dreams each night, and those dreams can be used in its memory.
    model: gpt-4o
    user_special_cmds: # allows commands that execute before a call to the model for user interaction
      - "!saveurl"
      - "!savethread"
    memory_k: 10 # how many memories will be returned in total after relevancy compression
    memory:
      - personal-vectorstore:
          vectorstore: lancedb
          k: 10 # how many candidate memories will be returned from this vectorstore
      - eduvac-vectorstore:
          vector_name: eduvac
          read_only: true # can only read, not write embeddings
          vectorstore: lancedb
          k: 3 # how many candidate memories will be returned from this vectorstore
```
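Once a VAC entry exists in `vacConfig`, you can read its settings back in code via the same `ConfigManager` shown above. A minimal sketch, assuming the `edmonbrain` entry above is available in your `config/` folder:

```python
from sunholo.utils import ConfigManager

config = ConfigManager('edmonbrain')

model = config.vacConfig('model')
# 'gpt-4o'
memory = config.vacConfig('memory')
# the list of memory destinations, e.g. the 'personal-vectorstore' lancedb settings
```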

## agentConfig

This configuration file sets up the standard endpoints for each type of agent a running VAC uses. It is also used to help create a Swagger specification when deploying to services such as Cloud Endpoints.

```yaml
# this config file controls the behaviour of agent-types such as langserve, controlling what endpoints are used
kind: agentConfig
apiVersion: v2
agents:
  default:
    #post-noauth:
      # add post endpoints that do not need authentication
    #get-auth:
      # add get endpoints that do need authentication
    post:
      stream: "{stem}/vac/streaming/{vector_name}"
      invoke: "{stem}/vac/{vector_name}"
      openai: "{stem}/openai/v1/chat/completions"
      openai-vac: "{stem}/openai/v1/chat/completions/{vector_name}"
    get:
      home: "{stem}"
      health: "{stem}/health"
    response:
      invoke:
        '200':
          description: Successful invocation response
          schema:
            type: object
            properties:
              answer:
                type: string
              source_documents:
                type: array
                items:
                  type: object
                  properties:
                    page_content:
                      type: string
                    metadata:
                      type: string
      stream:
        '200':
          description: Successful stream response
          schema:
            type: string
      openai:
        '200':
          description: Successful OpenAI response
          schema:
            type: object
            properties:
              id:
                type: string
              object:
                type: string
              created:
                type: string
              model:
                type: string
              system_fingerprint:
                type: string
              choices:
                type: array
                items:
                  type: object
                  properties:
                    index:
                      type: integer
                    delta:
                      type: object
                      properties:
                        content:
                          type: string
                    logprobs:
                      type: string
                    finish_reason:
                      type: string
              usage:
                type: object
                properties:
                  prompt_tokens:
                    type: integer
                  completion_tokens:
                    type: integer
                  total_tokens:
                    type: integer
      openai-vac:
        '200':
          description: Successful OpenAI VAC response
          schema:
            type: object
            properties:
              id:
                type: string
              object:
                type: string
              created:
                type: string
              model:
                type: string
              system_fingerprint:
                type: string
              choices:
                type: array
                items:
                  type: object
                  properties:
                    index:
                      type: integer
                    message:
                      type: object
                      properties:
                        role:
                          type: string
                        content:
                          type: string
                    logprobs:
                      type: string
                    finish_reason:
                      type: string
              usage:
                type: object
                properties:
                  prompt_tokens:
                    type: integer
                  completion_tokens:
                    type: integer
                  total_tokens:
                    type: integer
      home:
        '200':
          description: OK
          schema:
            type: string
      health:
        '200':
          description: A healthy response
          schema:
            type: object
            properties:
              status:
                type: string
        '500':
          description: Unhealthy response
          schema:
            type: string

  eduvac:
    get:
      docs: "{stem}/docs"
    get-auth:
      playground: "{stem}/{vector_name}/playground"
    post:
      stream: "{stem}/{vector_name}/stream"
      invoke: "{stem}/{vector_name}/invoke"
      input_schema: "{stem}/{vector_name}/input_schema"
      output_schema: "{stem}/{vector_name}/output_schema"
      config_schema: "{stem}/{vector_name}/config_schema"
      batch: "{stem}/{vector_name}/batch"
      stream_log: "{stem}/{vector_name}/stream_log"

  langserve:
    get:
      docs: "{stem}/docs"
      playground: "{stem}/{vector_name}/playground"
    get-auth:
      playground: "{stem}/{vector_name}/playground"
    post-noauth:
      # add post endpoints that do not need authentication
      output_schema: "{stem}/{vector_name}/output_schema"
    post:
      stream: "{stem}/{vector_name}/stream"
      invoke: "{stem}/{vector_name}/invoke"
      input_schema: "{stem}/{vector_name}/input_schema"
      config_schema: "{stem}/{vector_name}/config_schema"
      batch: "{stem}/{vector_name}/batch"
      stream_log: "{stem}/{vector_name}/stream_log"
```
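The `{stem}` and `{vector_name}` placeholders are filled in at runtime with the deployed service URL and the VAC name. A minimal sketch of how one of the `default` post endpoints resolves (the service URL below is hypothetical):

```python
# hypothetical Cloud Run URL for a deployed VAC service
stem = "https://edmonbrain-xxxxxx-ew.a.run.app"

# the 'stream' template from the default agent above
template = "{stem}/vac/streaming/{vector_name}"

url = template.format(stem=stem, vector_name="edmonbrain")
# 'https://edmonbrain-xxxxxx-ew.a.run.app/vac/streaming/edmonbrain'
```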

## userConfig

This lets you control user access by matching the tags within `vacConfig` against user email domains or addresses:

```yaml
kind: userConfig
apiVersion: v1
user_groups:
  - name: "admin"
    domain: "sunholo.com"
    role: "ADMIN"
    tags:
      - "admin_user"

  - name: "eduvac"
    emails:
      - "multivac@sunholo.com"
    role: "eduvac"
    tags:
      - "eduvac"

  # Example of another firm using both domain and specific emails
  - name: "another_firm"
    domain: "anotherfirm.com"
    emails:
      - "specialcase@anotherfirm.com"
    role: "partner"
    tags:
      - "partner"

default_user:
  role: "USER"
  tags:
    - "user"
```

## promptConfig

This file contains the various prompts for the `vector_name` of a VAC. The native Langfuse prompt library is preferred, but this yaml file serves as a backup if a prompt is not available via Langfuse.

```yaml
kind: promptConfig
apiVersion: v1
prompts:
  eduvac:
    intro: |
      You are an expert teacher versed with the latest techniques to enhance learning with your students.
      Today's date is {the_date}
      Please create an assignment for the student that will demonstrate their understanding of the text.
    template: |
      Answer the question below with the help of the following context.
      # Context
      {metadata}
      # End Context

      This is the conversation so far
      # Chat Summary
      ...{chat_summary}
      # Chat History
      ...{chat_history}
      # End of Chat History

      If you have made an earlier plan in your chat history,
      briefly restate it and update where you are in that plan to make sure to
      keep yourself on track and to not forget the original purpose of your answers.

      Question: {question}
      Your Answer:
    chat_summary: |
      Summarise the conversation below:
      # Chat History
      {chat_history}
      # End Chat History
      Your Summary of the chat history above:
    summarise_known_question: |
      You are a teaching assistant to a student and teacher who has this input from the student:
      {question}

      # Chat history (teacher and student)
      {chat_history}
      # End Chat History

      # Context (what the student is learning)
      {context}
      # End Context
      Assess if the student has completed the latest tasks set by the teacher,
      with recommendations on what the student and teacher should do next.

      Your Summary:
```
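The `{placeholders}` in each prompt are standard Python format fields, so a fetched prompt can be filled in with `str.format`. A minimal sketch, assuming `ConfigManager` exposes prompts via a `promptConfig` method that mirrors the `vacConfig` accessor shown earlier (check the library reference for the exact call):

```python
from datetime import date

from sunholo.utils import ConfigManager

config = ConfigManager('eduvac')

# assumption: prompt lookup mirrors the vacConfig accessor shown earlier
intro = config.promptConfig('intro')

filled = intro.format(the_date=date.today().isoformat())
```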