Backstory

In the Star Wars universe, C-3PO was designed as a protocol droid, equipped to aid in matters of etiquette, cultural norms, and language translation. With the ability to communicate in over six million forms of language, C-3PO serves as a robotic diplomat and translator across the vast and varied cultures of Lucas' imagined galaxy.

C-3PO is so much more important than we ever thought. C-3PO isn’t simply a bumbling sidekick translator, he is a support operative designed to keep the team on track and manage the various personalities of the heroes to keep things moving in the right direction. -- aha. secret coach so his personal introduction always "I am C-3PO, human-cyborg relations”. lol.

Goal

The goal of this project is to update and enhance C-3PO's (Commercial Crew and Cargo (3 C's) Program Office) technological capabilities by incorporating the most recent advancements in artificial intelligence (AI) technology in the year 2024.

This will involve modernizing his existing abilities and potentially adding new ones, all with the aim of making him even more versatile and effective in his roles as a protocol droid, robotic diplomat, and translator. By utilizing cutting-edge AI technology, we hope to ensure that C-3PO remains a relevant and valuable asset in the ever-evolving landscape of Lucas' imagined galaxy.

C-3PO Capabilities

Real-time voice activity detection (VAD)
Real-time automatic voice recognition (ASR)
Real-time text to speech synthesis toward human-like voice
Real-time speech-to-speech translation (STS) support 100+ languages.
Real-time user interface interaction with GenAI.
Real-time handfree voice interaction with GenAI.
Real-time voice-to-image generation
Real-time PII data security analysis on User Input and AI Output.
[-] Hybrid intelligent with dual memory systems (wip)
[-] World memory storage and retrieval (wip)
[-] Solar network integration (todo)
[-] Human-cyborg relations management (pending)
[-] Self-learning and deployment updates (pending)

Note: all-in-one running locally in a single PC or split into multiple.

C-3PO Installation

Prerequisites

Install Python >= 3.10
Install Poetry 1.8 system installation

Clone this repository:

git clone https://github.com/PavAI-Research/pavai-c3po.git
cd pavai-c3po

poetry shell

$ poetry shell
$ poetry install

Run llamacpp-python hardward optimized local installation see link here for more details: https://github.com/abetlen/llama-cpp-python

example: cuda optimized

CMAKE_ARGS="-DLLAMA_CUBLAS=on" poetry run pip install llama-cpp-python==0.2.27 --force-reinstall --no-cache-dir

please see llamacpp-python for your specific hardware optimization

First-time Setup

On first time setup it will automaticall download models required models

$ ./first_time_setup.sh

Models.

System run-mode configuration

see `env.shared` on run mode: variable: GLOBAL_SYSTEM_MODE="ollama-openai" where you can specify one of the following value:

locally-aio - all in one locally in a single pc
ollama-openai - ollama can be run locally or remote machine
solar-openai - llamacpp-python openai like server serving multiple models

variable: REFERENCE_VOICES="resources/config/reference_voices.json"

$ run_setup.sh

LLM provider configuration

see `env.shared` on following supported setup

Local with Llamacpp-python

GLOBAL_SYSTEM_MODE="locally-aio"

DEFAULT_LLM_MODEL_PATH="resources/models/llm/"
DEFAULT_LLM_MODEL_NAME_OR_PATH="TheBloke/zephyr-7B-beta-GGUF"
DEFAULT_LLM_MODEL_FILE="zephyr-7b-beta.Q4_K_M.gguf"
DEFAULT_LLM_MODEL_WEB_URL="https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q4_K_M.gguf"
DEFAULT_LLM_MODEL_CHAT_FORMAT="chatml"
DEFAULT_LLM_MODEL_DOWNLOAD_PATH='resources/models/llm/models--TheBloke--zephyr-7B-beta-GGUF/snapshots/e4714d14e9652aa9658fa937732cceadc63ac42e/zephyr-7b-beta.Q4_K_M.gguf'
DEFAULT_LLM_OFFLOAD_GPU_LAYERS=35

Ollama local or remote

GLOBAL_SYSTEM_MODE="ollama-openai"

SOLAR_LLM_OLLAMA_HOST="http://192.168.0.18:12345"
SOLAR_LLM_OLLAMA_SERVER_URL="http://192.168.0.18:12345/v1/"
SOLAR_LLM_OLLAMA_API_KEY="sk-pavai"
SOLAR_LLM_OLLAMA_MODEL_ID="zephyr:latest"

Openai like LLamacpp-python server

GLOBAL_SYSTEM_MODE="solar-openai"

SOLAR_LLM_DEFAULT_HOST="http://192.168.0.29:8004"
SOLAR_LLM_DEFAULT_SERVER_URL="http://192.168.0.29:8004/v1"
SOLAR_LLM_DEFAULT_API_KEY="sk-pavai"
SOLAR_LLM_DEFAULT_MODEL_ID="zephyr-7b-beta.Q4"

Startup resource and sanity check

By default everytime start Vocei or talkier the application perform a start up system checks. the check including required system resources and functionality then report a summary at the end.

Hardware Requirements

``` CPU: 6+ cores Memory: 32+ GB if you plan to run real-time translation with limited GPU vram Harddrive: 60+ GB if you plan to download and run all models locally GPU: 16+ GB if you plan to run all models in GPU as fast as possible Operating System: only tested in linux so far ``` Note: Very limited testing on other hardware and os due resource constraint

Important Considerations

Downloading translator model (facebook/seamless-m4t-v2-large) can take sometimes due larger model size 18 to 30 GB as result intial download time can be long depends on your internet connection.

Resource folder store downloaded models files

Workspace folder store user specific working files

Quickstart Vocei (Web UI) Locally

$ poetry shell
$ poetry install
## install LLamaCPP optimized for your hardware or default to cpu
## example for nvidia GPU card
$ CMAKE_ARGS="-DLLAMA_CUBLAS=on" poetry run pip install llama-cpp-python==0.2.27 --force-reinstall --no-cache-dir
## to get all the models locally (ONE TIME ONLY)
$ bash first_time_downloads.sh   

$ ./voice_gradioapp.sh
or 
$ ./voice_fastapp.sh

To use microphone in browser require use url: http://localhost:7860 or a secure link

Vocei Screenshots

Voice Prompt - ask for a fun joke.
Real-time Speech-to-Speech translator dual speaker mode.
Image generation - C-3PO working at office.
Image generation - Year of Dragon 2024

Quickstart Talkie (Handfree) Locally

$ poetry shell
$ ./talkie_cli.sh

Talkie Dialog System

The basic dialog system format:

talkie actively listening for voice activity
speak a wake up word: "hello mark" or "hi mark" to start a conversation mode.
speak your question and end with a work "please" or talkie code like "roger", "over"...etc
talkie generate user prompt then call LLM to respond
talkie convert text to speech then play back

Jane - is the default system voice for handle system startup speech Mark - is the user voice for handle user query and response

Other character wake up words call to start conversation with them

"anthony anthony"
"skywalker"
"yoda master"
"princess leia"
"c-3po"

speak "reset" to clear and start a new conversation speak "say again" to repeat last response audio

Talkie Screenshots

Talkie startup system health check.
Talkie actively listening
Talk to Prices Leia by using trigger word "princess leia"

Architecture & Design Overview

By design Pavai.C3PO can run totally offline in a single PC or in distributed mode for each core services. see diagram below for a overview. Architecture Overview

LLM Provider

$ run_setup.sh

Download Models

$ run_setup.sh

Talkie Dialog System

please see diagram below for the dual-loop ssytem.

Related Projects

Pavai Research aims to reinvent practical applications for artificial intelligence (AI).

[] hybrid-memory
[] world-storage
[] solar-network

Limitation and Common Issues

Translation conver only earth languages only so not ready for a galaxy far far away yet.

When running all models locally, the hardware requiements could go up to 16 GB Vram

Development and Testing of this project only been validated in Ubuntu only

Voice Synthesized Usage & Consensus

you agree to only use voices whose speakers grant the permission to have their voice cloned, either directly or by license before making synthesized voices public, or you have to publicly announce that these voices are synthesized if you do not have the permission to use these voices.

Generate requirement.txt

poetry export --without-hashes --format=requirements.txt > requirements.txt

Acknowledgments

many thanks to these projects for their inspiration and help, they including llamacpp-python,whisper,fasterwhisper, ollama, styledstts2, piper, vad and meta seamless communication.

License Agreement

The source code provided at https://github.com/PavAI-Research/pavai-c3po is licensed under the Apache 2.0 License that can be found at the root directory.

Name		Name	Last commit message	Last commit date
Latest commit History 113 Commits
.vscode		.vscode
bases		bases
components/pavai		components/pavai
development		development
docs		docs
infrastructure		infrastructure
projects		projects
resources		resources
test		test
workspace		workspace
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build_projects.sh		build_projects.sh
env.shared		env.shared
first_time_setup.sh		first_time_setup.sh
poetry.lock		poetry.lock
poetry.toml		poetry.toml
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
talkie_cli.sh		talkie_cli.sh
vocie_fastapp.sh		vocie_fastapp.sh
vocie_gradioapp.sh		vocie_gradioapp.sh
workspace.toml		workspace.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Backstory

Goal

C-3PO Capabilities

C-3PO Installation

Quickstart Vocei (Web UI) Locally

Quickstart Talkie (Handfree) Locally

Architecture & Design Overview

Talkie Dialog System

Related Projects

Limitation and Common Issues

Voice Synthesized Usage & Consensus

Generate requirement.txt

Acknowledgments

License Agreement

About

Uh oh!

Releases 1

Packages

Uh oh!

Languages

License

PavAI-Research/pavai-c3po

Folders and files

Latest commit

History

Repository files navigation

Backstory

Goal

C-3PO Capabilities

C-3PO Installation

Quickstart Vocei (Web UI) Locally

Quickstart Talkie (Handfree) Locally

Architecture & Design Overview

Talkie Dialog System

Related Projects

Limitation and Common Issues

Voice Synthesized Usage & Consensus

Generate requirement.txt

Acknowledgments

License Agreement

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Languages

Packages