Neural narratives in Python #31

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the last episode of this thing, our suave protagonist, Japanese teenager Takumi Arai, thanked the irritated half-humanoid, half-scorpion guardian for her help, then set off with the gender-ambiguous Sandstrider Kael Marrek back into the desert sun, to figure out how to make money in this new world.

That’s all for today, I’m afraid, because I had to do a major restructuring of my app. As I was adding a fact to the playthrough (facts being any more-or-less objective notions that the characters know about their reality), I started thinking about scalability. All the facts introduced relate to this deserty part of the fantasy world, and they would be generally useless if the protagonist were to travel somewhere else. However, every prompt that involves facts grabs them all from the corresponding text file, so the more facts the user adds, the more they fill the limited context window that large language models have to work with, potentially with unrelated stuff. How to solve this?

Well, I knew what used to be the best idea for solving the issue: vector databases. They are a fancy way of decomposing text into multidimensional vectors of floating-point numbers. When you query such a database with any text, the query gets embedded into a vector of its own. Then the distance from that vector to the vectors stored in the database gets calculated, and the database returns the closest ones. Those closest vectors happen to correspond to the semantically closest data stored in the database. That’s the hard way of saying that when you ask a vector database a question, it returns the contents most closely related to the question. It’s almost like magic. It doesn’t search for specific keywords exactly; if you query it with the word “desert,” it may return stuff that involves the words “oasis,” “camel,” “sun,” etc. If I implemented this in my app, the descriptions of the places, some character info, etc. would be sent as the query to the database, and the corresponding facts or character memories would get returned, up to an arbitrary limit of results. It fixes all the problems.
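
The nearest-neighbor idea can be sketched in plain Python with toy embeddings. Real embedding models output vectors with hundreds of dimensions; the three-dimensional vectors, cosine similarity, and sample documents here are made up purely for illustration:

```python
import math

# Toy "embeddings" standing in for a real model's output.
DOCS = {
    "The oasis shimmered between the dunes.": [0.9, 0.1, 0.0],
    "Camels rested in the shade of the palms.": [0.8, 0.2, 0.1],
    "The starship docked at the orbital ring.": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def query(query_vector, top_k=2):
    """Return the top_k documents whose vectors lie closest to the query."""
    ranked = sorted(
        DOCS, key=lambda doc: cosine_similarity(query_vector, DOCS[doc]), reverse=True
    )
    return ranked[:top_k]

# A query vector near the "desert" documents retrieves them first,
# even though it shares no keywords with them.
print(query([0.85, 0.15, 0.05]))
```

A real vector database does exactly this, just with an index that avoids comparing the query against every stored vector.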

The issue is implementing such a thing. The last time I attempted it, a couple of years ago, it was a mess, and I never got it to work as I had expected. After interviewing OpenAI’s Orion preview model for a bit, it turns out that last time I may have picked the worst Python library for working with vector databases, or else many advances have been made since then. This time I chose the chromadb library, which specializes in working with large language models. Implementing the database turned out to be very intuitive. Here’s the entire code of that implementation:

from enum import Enum
from typing import List, Optional, Dict, Any

import chromadb
from chromadb.api.types import IncludeEnum  # noqa
from chromadb.config import Settings
from chromadb.utils import embedding_functions

from src.base.validators import validate_non_empty_string
from src.databases.abstracts.database import Database
from src.filesystem.path_manager import PathManager


class ChromaDbDatabase(Database):

    class DataType(Enum):
        CHARACTER_IDENTIFIER = "character_identifier"
        FACT = "fact"
        MEMORY = "memory"

    def __init__(
        self, playthrough_name: str, path_manager: Optional[PathManager] = None
    ):
        validate_non_empty_string(playthrough_name, "playthrough_name")

        self._path_manager = path_manager or PathManager()

        # Initialize Chroma client with per-playthrough persistent storage.
        self._chroma_client = chromadb.PersistentClient(
            path=self._path_manager.get_database_path(playthrough_name).as_posix(),
            settings=Settings(anonymized_telemetry=False, allow_reset=True),
        )

        # Use a single collection for all data types within the playthrough
        self._collection = self._chroma_client.get_or_create_collection(
            name="playthrough_data"
        )

        self._embedding_function = embedding_functions.DefaultEmbeddingFunction()

    def _determine_where_clause(
        self, data_type: str, character_identifier: Optional[str] = None
    ) -> Dict[str, Any]:
        where_clause = {"type": data_type}
        if character_identifier:
            # chromadb requires the "$and" operator to combine multiple conditions.
            where_clause = {
                "$and": [
                    where_clause,
                    {self.DataType.CHARACTER_IDENTIFIER.value: character_identifier},
                ]
            }

        return where_clause

    def _insert_data(
        self, text: str, data_type: str, character_identifier: Optional[str] = None
    ):
        # Note: using the collection's count as the id assumes items are never
        # deleted, since otherwise ids could collide with existing entries.
        data_id = str(self._collection.count())
        metadata = {"type": data_type}
        if character_identifier:
            metadata[self.DataType.CHARACTER_IDENTIFIER.value] = character_identifier

        # Upsert adds items whose ids aren't present in the collection,
        # and updates those whose ids already exist.
        self._collection.upsert(
            ids=[data_id],
            documents=[text],
            embeddings=self._embedding_function([text]),
            metadatas=[metadata],
        )

    def _retrieve_data(
        self,
        query_text: str,
        data_type: str,
        character_identifier: Optional[str] = None,
        top_k: int = 5,
    ) -> List[str]:
        results = self._collection.query(
            query_embeddings=self._embedding_function([query_text]),
            n_results=top_k,
            where=self._determine_where_clause(data_type, character_identifier),
            include=[IncludeEnum.documents],
        )

        return results["documents"][0] if results["documents"] else []

    def insert_fact(self, fact: str) -> None:
        self._insert_data(fact, data_type=self.DataType.FACT.value)

    def insert_memory(self, character_identifier: str, memory: str) -> None:
        self._insert_data(
            memory,
            data_type=self.DataType.MEMORY.value,
            character_identifier=character_identifier,
        )

    def retrieve_facts(self, query_text: str, top_k: int = 5) -> List[str]:
        return self._retrieve_data(
            query_text, data_type=self.DataType.FACT.value, top_k=top_k
        )

    def retrieve_memories(
        self, character_identifier: str, query_text: str, top_k: int = 5
    ) -> List[str]:
        return self._retrieve_data(
            query_text,
            data_type=self.DataType.MEMORY.value,
            character_identifier=character_identifier,
            top_k=top_k,
        )

Obviously, I had to hunt down every previous reference to facts and memories so that they no longer rely on plain text files, but instead insert all relevant data into the database or query it from there. I got everything working seamlessly. As of today, I have 527 tests in total, but the app has grown to such a size that it doesn’t surprise me when it starts creaking from some nook, which I usually hurry to pin in place with a test. I rely on OpenAI’s Orion models exclusively to write those tests, as tests are annoying to set up and eat up development time, even though they are invaluable for ensuring everything works as needed.
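
The where-clause builder in the class above is the kind of small, pure logic those tests pin down. Here’s a sketch of how such a test might look, with the logic extracted into a standalone function and the expectations inline (illustrative; not the repo’s actual tests):

```python
from typing import Any, Dict, Optional

def determine_where_clause(
    data_type: str, character_identifier: Optional[str] = None
) -> Dict[str, Any]:
    # Standalone copy of the filter-building logic, testable without chromadb.
    where_clause: Dict[str, Any] = {"type": data_type}
    if character_identifier:
        # chromadb needs "$and" to combine multiple metadata conditions.
        where_clause = {
            "$and": [where_clause, {"character_identifier": character_identifier}]
        }
    return where_clause

# Facts are filtered by type alone; memories also pin down the character.
assert determine_where_clause("fact") == {"type": "fact"}
assert determine_where_clause("memory", "kael") == {
    "$and": [{"type": "memory"}, {"character_identifier": "kael"}]
}
```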

I’m an obsessive dude in general, and so is the case with my code. If I need to produce some data, I write a Provider or an Algorithm class, which are then created through Factories. Non-returning operations are encapsulated in Commands, which can be linked together like Lego pieces. It’s all very aesthetically pleasing, if you’re a programmer at least. The weakest link is the Flask views, which are probably hard to test as they’re the endpoints, but I haven’t tried to do so, because I tend to move complicated, non-instantiating code to isolated modules. The instantiation gets done as close to the endpoint as possible, or else with Composer classes. All the instances get passed to further classes through Dependency Injection. Code quality, baby.
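
As an illustration of that style, here’s a minimal sketch of Commands linked into a composite, with the dependency injected rather than constructed inside (the class names are made up for the example; they aren’t the app’s real ones):

```python
from abc import ABC, abstractmethod
from typing import List

class Command(ABC):
    """A non-returning operation, chainable like Lego pieces."""
    @abstractmethod
    def execute(self) -> None: ...

class StoreFactCommand(Command):
    def __init__(self, database, fact: str):
        # The database is injected, which keeps the command testable with a fake.
        self._database = database
        self._fact = fact

    def execute(self) -> None:
        self._database.insert_fact(self._fact)

class CompositeCommand(Command):
    """Links commands together and runs them in order."""
    def __init__(self, commands: List[Command]):
        self._commands = commands

    def execute(self) -> None:
        for command in self._commands:
            command.execute()

class FakeDatabase:
    """Stand-in dependency for demonstration and testing."""
    def __init__(self):
        self.facts: List[str] = []
    def insert_fact(self, fact: str) -> None:
        self.facts.append(fact)

database = FakeDatabase()
CompositeCommand(
    [StoreFactCommand(database, "The world has twin moons."),
     StoreFactCommand(database, "Sandstriders roam the wastes.")]
).execute()
```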

I think I’ve mentioned it before, but I got into creating this app because I wanted to involve artificial intelligence in my smut sessions. As often happens, technological development is driven by men’s need to have increasingly better orgasms. Can’t wait for the sexbots.

Neural narratives in Python #30

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the last part, our protagonist, having been sent by a ditzy goddess into a scorching desert world, or at least a deserty part of a fantasy world, deals with an imposing half-person, half-scorpion guardian, who offers him sanctuary in their safe house as long as the protagonist passes an initiation rite.

That was one of the funnest interactions I’ve had through this app. I’ve got a soft spot for that incompetent goddess. And the scene ends with the driving lesson of isekai: sometimes we must lose one world entirely to find our true place in another.

Although a week ago I programmed the ability for the user to add participants to an ongoing dialogue, I hadn’t programmed the feature to remove participants from one. It was necessary to do so given the circumstances; otherwise, the AI might have chosen to speak as Seraphina even though she was supposed to be gone. In addition, when a dialogue ends, a summary is generated and added as a memory to the participants. In the case of participants leaving mid-conversation, it wouldn’t make sense for them to know what happened after they left, so now, for each character who leaves mid-convo, a summary of the dialogue up to that point is added to their memories.
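
The removal logic boils down to something like this sketch (hypothetical names; in the real app the summarizer is a prompt to the large language model, not a lambda):

```python
from typing import Dict, List

def remove_participant(
    participants: List[str],
    transcript: List[str],
    memories: Dict[str, List[str]],
    leaver: str,
    summarize,  # callable: transcript-so-far -> summary string (an LLM call in the app)
) -> None:
    """Drop a character mid-dialogue, leaving them a memory of only
    what happened up to the point they left."""
    participants.remove(leaver)
    memories.setdefault(leaver, []).append(summarize(transcript))

participants = ["Takumi", "Kael", "Seraphina"]
transcript = ["Takumi: We should head out.", "Seraphina: My duty ends here."]
memories: Dict[str, List[str]] = {}

# Seraphina leaves; she remembers the conversation only up to this point.
remove_participant(
    participants, transcript, memories, "Seraphina",
    summarize=lambda lines: f"A conversation of {len(lines)} exchanges.",
)
```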

My app has a section called Story Hub that allows the user to generate story concepts, to help them figure out where the story may be going. The user could already generate plot blueprints, scenarios, goals, dilemmas, and plot twists. Thanks to the massive refactoring I did across the whole breadth of story concepts in the app, adding new ones was easy.

I’ve also involved the facts added by the player in many prompts to the AI, including dialogue. Facts are supposed to represent well-known information about the world, such as legends, properties of animals or sentient races, etc. For example, one of the generated pieces of lore named the twin moons of this world, so I added that information to the facts. My biggest worry is the context window of some large language models: my favorite right now, Magnum 72B, has a tiny context of 16,000 tokens, and the more you add to memories and facts, the more of the context they eat, until you’re forced to switch to a subpar model.
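
One blunt way to keep facts from eating the whole context is to cut them off at a token budget. A minimal sketch of the idea, with a crude whitespace tokenizer standing in for a real one, and made-up facts (this isn’t the app’s code):

```python
def fit_to_budget(facts, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep taking facts (assumed sorted most relevant first) until the
    token budget is spent; the rest get dropped from the prompt."""
    selected, used = [], 0
    for fact in facts:
        cost = count_tokens(fact)
        if used + cost > max_tokens:
            break
        selected.append(fact)
        used += cost
    return selected

facts = [
    "The twin moons are named Lyra and Vesh.",
    "Sandstriders trade water for secrets.",
    "The Mirage Wastes shift their borders nightly.",
]
print(fit_to_budget(facts, max_tokens=14))
```

A retrieval step like the vector database from part #31 makes this much less painful, since the facts arriving here are already the relevant ones.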

That’s all for now. Stay whimsical.

Neural narratives in Python #29

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the previous part, a Sandstrider named Kael Marrek gave our hapless, reincarnated protagonist a tour of the local market, providing basic advice so the protagonist doesn’t die the first night. Kael guided him to a sanctuary where many of the local displaced take shelter.

The next scene, taking place in an initiation chamber, was probably my favorite of all the interactions I’ve had in this app of mine. I’ll post it tomorrow.

Neural narratives in Python #28

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the previous and first part of this new tale, our protagonist, Japanese teenager Takumi Arai, fucking died, but a ditzy goddess introduced him to the wonders of reincarnation. Now, Takumi finds himself in an unknown city, retaining his previous form and memories but not knowing anything about this world where he has ended up.

Takumi was lucky enough to come across a reasonable outcast like Kael Marrek, of indeterminate gender. He or she gives Takumi a tour of the teeming market.

Neural narratives in Python #27

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

The previous part saw the ending of the cosmic horror tale I was telling. This one will see the beginning of the silly isekai thing I’ll do next in my AI-fueled app.

Here’s our suave protagonist, Japanese teenager Takumi Arai:

He lives in quite the peculiar story universe, but I’ll let you discover it. The story starts with him being visited by a certain interdimensional legend.

That’s the end of Takumi Arai. But in this story universe, a visit from Truck-kun isn’t the end. And yes, I went through the trouble of creating a particular world, region, and area for the original world, as well as Truck-kun himself, even though I may never revisit them.

Well, that was one of the most chaotic interactions I’ve had on this app.

Neural narratives in Python #26

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the previous part, the protagonist realized that the alien Zha’thik, who had subjected young Elizabeth Harrow to a ritual intended to turn her into some sort of cosmic entity, had fallen in love with the earthly teenager. The team convinced Zha’thik to let Elizabeth endure her changes back at home. The alien was even kind enough to open a dimensional portal back to Earth.

Here’s the somber resolution of this story.

Three days later, the pair of detectives are expecting to meet up with brilliant scholar Elara Thorn at the hole-in-the-wall where they first met.

The end. That’s all the cosmic horror I had to give for now.

This first serious playthrough of my app proved that the system can handle a full story. Highlights for me: how unique most characters sounded thanks to the combination of dedicated bios and speech patterns, along with the voice models. The brilliance of their speeches regularly surprised me (and highlighted the chasm between their intelligence and my limited human capabilities), and there were times when I forgot that I wasn’t writing to an actual human being.

Quite a few times, I wasn’t sure how to continue the story (I didn’t want to create a plan beforehand, given that I intended to play this out as if I were partaking in roleplaying sessions), but thankfully the “concepts system,” in which the user can generate scenarios, goals, plot twists, dilemmas, etc., helped push me along. When more complicated feedback was required, the Writers’ Room feature, in which a team of AI agents representing the various roles in a writers’ room handles your requests, solved the remaining issues. When I wanted to brainstorm the specifics of a location or a character, I proposed the topic to the swarm of agents, and they always provided just the stuff I needed.

Issues: first, a mechanical one in my app: when the characters were going to interact with a place that wasn’t connected to the world > region in which they started, I had to edit the new hierarchy of places into the JSON data files by hand. I solved that issue this morning: there’s now an Attach Places page that displays the available templates and lets the user attach them with a simple click. That should solve most such issues.

The bigger issue, though, was the large language models (the AI) themselves. Right now there are various contenders among the heavy-hitters, depending on how much you’re willing to spend. Hermes 405B was great for regular writing and dialogue, the best one I had come across that remained uncensored, until I found Magnum 72B. Unfortunately, Magnum is considerably slower and, much worse, has a 16k context window due to its sole provider, meaning that I had to switch back to Hermes 405B whenever the text sent to the LLM became too long. By far, though, the best large language model for dialogue I’ve come across is Claude Sonnet, at 15 dollars per million tokens of output. That’s steep, although not remotely as steep as OpenAI’s Orion preview. Sonnet is likely censored, but I haven’t had issues with moderation in the tests I’ve gotten it tangled with (and they involved steamy stuff).
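
That switch-models-when-the-prompt-outgrows-the-context behavior can be sketched as a preference-ordered fallback table. Magnum’s 16k context comes from the text above; the Hermes figure and the model names are assumptions made up for the example:

```python
# Hypothetical model table, ordered by preference.
MODELS = [
    ("magnum-72b", 16_000),    # preferred for prose, but tiny context
    ("hermes-405b", 128_000),  # fallback once the prompt outgrows Magnum
]

def pick_model(prompt_tokens: int, reserved_for_output: int = 1_000) -> str:
    """Return the most preferred model whose context window fits the prompt
    plus some room reserved for the reply."""
    for name, context_window in MODELS:
        if prompt_tokens + reserved_for_output <= context_window:
            return name
    raise ValueError("Prompt too long for every configured model.")
```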

Next up, something to which I’m drawn instinctively: deranged silliness with perverted undertones. The protagonist will be a somewhat over-the-top teenager who gets reincarnated into a fantasy world. Expect loads of bizarre characters and zany situations. Possibly some monster sex. I’ve already produced the first “episode” of it, and it has been delightful.

Anyway, I don’t know if anyone has followed this first story, but if you have enjoyed it, then great.

Neural narratives in Python #25

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the previous part of this little tale, the team of heroes found the missing girl, Elizabeth Harrow, but the ritual she had been subjected to had turned her into something not quite human. The culprit, an alien named Zha’thik, showed herself to be impervious to bullets.

Here’s the disconcerting climax of this little story.

Notes on this part: I genuinely had no clue how to resolve this situation, hence the protagonist requesting valid plans from others. When Zha’thik referred to her and Elizabeth being together, I saw an opening, and ran with it. Turned out better than I expected.

Anyway, the story will end in the following part. I have already produced it, so I’ll probably post it tomorrow.

Neural narratives in Python #24

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

We abandoned our team of heroes as they frantically tried to collapse the pocket dimension that contained them, before the multiplying tears in reality overwhelmed them.

As I was preparing the setting for this scene, it became obvious that I needed a new abstraction in my hierarchy of places. You see, many prompts to the large language models get fed the combination of places where the scene happens: story universe > world > region > area, and possibly location as well. But in this case, the characters were in a distinct inner chamber inside a sanctuary. The sanctuary would be the location, while the inner chamber was a sub-location, or a room. So clearly, given that the story has demanded it, instead of “half-assing” such places into the hierarchy, I should modify all the code that touches the hierarchy to account for a new type of place: a Room. Perhaps rooms won’t be used that often, but if a serious playthrough has required one, then I need to program it in. This may take a while; the last time I touched the hierarchy of places, which is almost the spine of the whole app, it took me days to return the app to normal. We’ll see how it goes.
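
The extended chain with an optional Room level might look something like this (an illustrative sketch with made-up names; the app’s actual types surely differ):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Place:
    """One scene's position in the hierarchy of places.
    The levels past area are optional; a room only exists inside a location."""
    story_universe: str
    world: str
    region: str
    area: str
    location: Optional[str] = None
    room: Optional[str] = None

    def describe(self) -> str:
        # Build the "story universe > world > region > area [> location [> room]]"
        # string that gets fed to the prompts.
        parts = [self.story_universe, self.world, self.region, self.area]
        if self.location:
            parts.append(self.location)
            if self.room:
                parts.append(self.room)
        return " > ".join(parts)

scene = Place("Isekai-verse", "Desert World", "The Wastes", "Outpost",
              location="Sanctuary", room="Inner Chamber")
print(scene.describe())
```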

Neural narratives in Python #23

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

In the previous part, our team of heroes found themselves trapped in a pocket dimension created by the malicious alien Zha’thik, a dimension made from the memories of Elizabeth Harrow so she would stay inside in a sort of dream state. The team will attempt to collapse the pocket dimension to return to the containing dimension, the so-called Mirage Wastes.

They finally prepare themselves to do something as troublesome as collapsing a pocket dimension with themselves inside.

Intense episode.

Neural narratives in Python #22

I recommend checking out the previous parts if you don’t know what this “neural narratives” thing is about. In short, I wrote a system in Python to have multi-character conversations with large language models (like Llama 3.1), in which the characters are isolated in terms of memories and bios, so there’s no leakage to other participants. Here’s the GitHub repo, which now even features a proper readme.

At the end of the previous part, we left our jaded protagonist boxing an eldritch horror while his long-dead teenage lover ran for her life.

I had no clue where Cassidy’s portal would lead, or even if it would work. I organized a session with my Writers’ Room to put together the following scene, which involved generating a couple of new characters.

Those noises are artifacts that rarely happen when generating voices. It’s odd that they happened three times in this segment, but they seemed to fit the eeriness, so I left them in. I apologize for minor errors such as using “his” when referring to Elizabeth. My brain is mostly mush at this point.

When this scene started, I had only included the main team in the dialogue, but I wanted Gideon’s deceased wife and the young version of their daughter to show up in the middle of the conversation. I had no such system implemented, so I wrote it in: a new collapsible section in the chat page where, if there are characters present in the same place who aren’t involved in the dialogue, you can simply add them, and they’ll be included in the prompts to the large language model from then onwards. It was surprisingly easy to do, which I suppose is a testament to how mature the app is at this point.
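
That add-mid-conversation feature boils down to something like this (hypothetical names and a bare function; in the real app the operation would go through a Command):

```python
from typing import List, Set

def add_participants(
    participants: List[str], present_characters: Set[str], newcomers: List[str]
) -> List[str]:
    """Add characters who are present in the scene but not yet in the dialogue;
    from then on they'd be included in the prompts to the model."""
    for character in newcomers:
        if character not in present_characters:
            raise ValueError(f"{character} is not present in this place.")
        if character not in participants:
            participants.append(character)
    return participants

participants = ["Gideon", "Elara Thorn"]
present = {"Gideon", "Elara Thorn", "Miriam", "Young Daughter"}

# Two characters standing in the same place join the ongoing dialogue.
add_participants(participants, present, ["Miriam", "Young Daughter"])
```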