I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python for having multi-character conversations with large language models (like Llama 3.1), in which the characters’ memories and bios are kept isolated, so nothing leaks to the other participants. Here’s the GitHub repo, which now even features a proper readme.
In the last episode of this thing, our suave protagonist, Japanese teenager Takumi Arai, thanked the irritated half-humanoid, half-scorpion guardian for her help, then set off with the gender-ambiguous Sandstrider Kael Marrek back into the desert sun, to figure out how to make money in this new world.
That’s all for today, I’m afraid, because I had to do a major restructuring of my app. As I was adding a fact to the playthrough (facts being any more-or-less objective notions that the characters know about their reality), I started thinking about scalability. All the facts introduced so far relate to this deserty part of the fantasy world, and they would be generally useless if the protagonist were to travel somewhere else. However, every prompt that involves facts grabs them from the corresponding text file, so the more facts the user adds, the more they fill up the limited context window that the large language models have to work with, potentially with unrelated stuff. How to solve this?
Well, I knew what used to be the best idea for how to solve the issue: vector databases. They are a fancy way of decomposing text into multidimensional vectors of floating-point numbers. When you query such a database with any text, the query gets decomposed into vectors too. Then the distance between those vectors and the vectors stored in the database gets calculated, and the database returns the closest ones. Those closest vectors happen to be the semantically closest data stored in the database. That’s the hard way of saying that when you ask a vector database a question, it returns the contents most closely related to the question. It’s almost like magic. It doesn’t search for specific keywords exactly; if you query it with the word “desert,” it may return stuff that involves the words “oasis,” “camel,” “sun,” etc. If I implemented this in my app, the descriptions of the places, some character info, etc. would be sent as the query to the database, and the corresponding facts or character memories would get returned, up to an arbitrary limit of results. That fixes all the problems.
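If you want to see that behavior with your own eyes, here’s a minimal, throwaway sketch using chromadb’s in-memory client and its default embedding function; the collection name and the documents are made up for illustration:

# A toy illustration of semantic retrieval: none of these documents contain
# the word "desert", yet the query still surfaces the related ones first.
import chromadb

client = chromadb.Client()  # in-memory, throwaway
collection = client.get_or_create_collection(name="toy_facts")

collection.add(
    ids=["0", "1", "2"],
    documents=[
        "The oasis is the only reliable source of water for the caravans.",
        "Camels are bred by the Sandstriders and traded at the spice markets.",
        "The northern kingdoms are covered in snow for most of the year.",
    ],
)

# The query text gets embedded and compared against the stored embeddings;
# the two closest documents should be the oasis and camel facts.
results = collection.query(query_texts=["desert"], n_results=2)
print(results["documents"][0])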
The issue is implementing such a thing. The last time I attempted it, a couple of years ago, it was a mess, and I never got it to work as I had expected. After interviewing OpenAI’s Orion preview model for a bit, it turns out that last time I may have picked the worst Python library for working with vector databases, or else many advances have been made since then. This time I chose the chromadb library, which specializes in working with large language models. Implementing the database turned out to be very intuitive. Here’s the entire code of that implementation:
from enum import Enum
from typing import List, Optional, Dict, Any

import chromadb
from chromadb.api.types import IncludeEnum  # noqa
from chromadb.config import Settings
from chromadb.utils import embedding_functions

from src.base.validators import validate_non_empty_string
from src.databases.abstracts.database import Database
from src.filesystem.path_manager import PathManager


class ChromaDbDatabase(Database):
    class DataType(Enum):
        CHARACTER_IDENTIFIER = "character_identifier"
        FACT = "fact"
        MEMORY = "memory"

    def __init__(
        self, playthrough_name: str, path_manager: Optional[PathManager] = None
    ):
        validate_non_empty_string(playthrough_name, "playthrough_name")

        self._path_manager = path_manager or PathManager()

        # Initialize Chroma client with per-playthrough persistent storage.
        self._chroma_client = chromadb.PersistentClient(
            path=self._path_manager.get_database_path(playthrough_name).as_posix(),
            settings=Settings(anonymized_telemetry=False, allow_reset=True),
        )

        # Use a single collection for all data types within the playthrough.
        self._collection = self._chroma_client.get_or_create_collection(
            name="playthrough_data"
        )

        self._embedding_function = embedding_functions.DefaultEmbeddingFunction()

    def _determine_where_clause(
        self, data_type: str, character_identifier: Optional[str] = None
    ) -> Dict[str, Any]:
        where_clause = {"type": data_type}

        if character_identifier:
            # Must use the "$and" operator.
            where_clause = {
                "$and": [
                    where_clause,
                    {self.DataType.CHARACTER_IDENTIFIER.value: character_identifier},
                ]
            }

        return where_clause

    def _insert_data(
        self, text: str, data_type: str, character_identifier: Optional[str] = None
    ):
        data_id = str(self._collection.count())

        metadata = {"type": data_type}

        if character_identifier:
            metadata[self.DataType.CHARACTER_IDENTIFIER.value] = character_identifier

        # Upsert updates existing items, or adds them if they don't exist.
        # If an id is not present in the collection, the corresponding items will
        # be created as per add. Items with existing ids will be updated as per update.
        self._collection.upsert(
            ids=[data_id],
            documents=[text],
            embeddings=self._embedding_function([text]),
            metadatas=[metadata],
        )

    def _retrieve_data(
        self,
        query_text: str,
        data_type: str,
        character_identifier: Optional[str] = None,
        top_k: int = 5,
    ) -> List[str]:
        results = self._collection.query(
            query_embeddings=self._embedding_function([query_text]),
            n_results=top_k,
            where=self._determine_where_clause(data_type, character_identifier),
            include=[IncludeEnum.documents],
        )

        return results["documents"][0] if results["documents"] else []

    def insert_fact(self, fact: str) -> None:
        self._insert_data(fact, data_type=self.DataType.FACT.value)

    def insert_memory(self, character_identifier: str, memory: str) -> None:
        self._insert_data(
            memory,
            data_type=self.DataType.MEMORY.value,
            character_identifier=character_identifier,
        )

    def retrieve_facts(self, query_text: str, top_k: int = 5) -> List[str]:
        return self._retrieve_data(
            query_text, data_type=self.DataType.FACT.value, top_k=top_k
        )

    def retrieve_memories(
        self, character_identifier: str, query_text: str, top_k: int = 5
    ) -> List[str]:
        return self._retrieve_data(
            query_text,
            data_type=self.DataType.MEMORY.value,
            character_identifier=character_identifier,
            top_k=top_k,
        )
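For context, here’s roughly how the class above gets used; the playthrough name, the stored texts, and the queries below are invented for illustration:

# A quick usage sketch (the playthrough name and texts are made up; in the
# real app the PathManager decides where the persistent data actually lives).
database = ChromaDbDatabase("desert_playthrough")

database.insert_fact("The Sandstriders trade water rights at the oasis.")
database.insert_memory(
    character_identifier="takumi_arai",
    memory="Kael Marrek promised to teach me how to haggle with caravan masters.",
)

# Place descriptions, character info, etc. become the query text, and only
# the semantically closest facts/memories come back, up to top_k results.
relevant_facts = database.retrieve_facts("haggling at a desert market", top_k=3)
takumi_memories = database.retrieve_memories(
    "takumi_arai", "making money in the new world", top_k=3
)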
Obviously, I had to hunt down every previous reference to facts and memories so that they no longer rely on plain text files, but instead insert all relevant data into the database or query it from there. I got everything working seamlessly. As of today, I have 527 tests in total, but the app has grown to such a size that it doesn’t surprise me when it starts creaking in some nook, which I usually hurry to pin in place with a test. I rely exclusively on OpenAI’s Orion models to write those tests, as tests are annoying to set up and eat up development time, even though they are invaluable for ensuring everything works as needed.
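To give an idea of what those tests look like, here’s a sketch in the same spirit (not one of the actual 527; the stub path manager and the import path are made up for illustration):

# Point the database at a temporary directory and check that a memory inserted
# for one character never leaks into another character's retrievals.
from pathlib import Path

from src.databases.chroma_db_database import ChromaDbDatabase  # hypothetical module path


class _StubPathManager:
    """Hypothetical stand-in for PathManager, good enough for this test."""

    def __init__(self, base: Path):
        self._base = base

    def get_database_path(self, playthrough_name: str) -> Path:
        return self._base / playthrough_name


def test_memories_stay_isolated_per_character(tmp_path):
    database = ChromaDbDatabase(
        "test_playthrough", path_manager=_StubPathManager(tmp_path)
    )

    database.insert_memory("takumi_arai", "I crossed the dunes with Kael Marrek.")
    database.insert_memory("kael_marrek", "The boy from another world asks too many questions.")

    results = database.retrieve_memories("takumi_arai", "travelling the desert", top_k=5)

    # Takumi's own memory comes back; Kael's never does.
    assert any("dunes" in memory for memory in results)
    assert all("asks too many questions" not in memory for memory in results)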
I’m an obsessive dude in general, and so is the case with my code. If I need to produce some data, I write a Provider or an Algorithm class, which then gets created through a Factory. Non-returning operations are encapsulated in Commands, which can be linked together like Lego pieces. It’s all very aesthetically pleasing, if you’re a programmer at least. The weakest link is the Flask views, which are probably hard to test given that they’re the endpoints, but I haven’t tried to do so, because I tend to move complicated, non-instantiating code to isolated modules. The instantiation gets done as close to the endpoint as possible, or else in Composer classes. All the instantiated objects get passed on to further classes through dependency injection. Code quality, baby.
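To make that less abstract, here’s a rough sketch of the shape I mean; the class names are invented for illustration, not lifted from the repo:

# A Provider produces data, a Command encapsulates a non-returning operation,
# and a Factory holds the long-lived dependency and injects it, so the
# endpoint never builds anything by hand.
from typing import List


class RelevantFactsProvider:
    """Produces data: the facts that relate to a given scene description."""

    def __init__(self, database: ChromaDbDatabase, scene_description: str):
        self._database = database
        self._scene_description = scene_description

    def provide_facts(self) -> List[str]:
        return self._database.retrieve_facts(self._scene_description)


class StoreMemoryCommand:
    """A non-returning operation: persist one character's new memory."""

    def __init__(self, database: ChromaDbDatabase, character_identifier: str, memory: str):
        self._database = database
        self._character_identifier = character_identifier
        self._memory = memory

    def execute(self) -> None:
        self._database.insert_memory(self._character_identifier, self._memory)


class RelevantFactsProviderFactory:
    """Receives the database once, then creates providers per request."""

    def __init__(self, database: ChromaDbDatabase):
        self._database = database

    def create_provider(self, scene_description: str) -> RelevantFactsProvider:
        return RelevantFactsProvider(self._database, scene_description)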
I think I’ve mentioned it before, but I got into creating this app because I wanted to involve artificial intelligence in my smut sessions. As often happens, technological development is driven by men’s need to have increasingly better orgasms. Can’t wait for the sexbots.