What’s On: Enhancing an AI with data from a local database

What’s On is a simple Go program that interacts with Anthropic’s Claude AI to work with data from a local database.

Categories: AI, Claude, Go
Author: John Bates
Published: July 10, 2024

What’s On is a simple Go command-line program that uses the Claude API (through go-anthropic) to return a number of TV programme viewing suggestions in response to a statement of desire. The purpose of the program is to allow me to experiment with Claude embedding, its use of tools, and its XML input and output techniques. The purpose of this notebook is to remind me how it works.

The program uses a local database of TV schedules which Claude requests access to through its ‘tool’ mechanism. To avoid having to recreate the database daily, dates in the database are automatically shifted to appear as if they start from the current date.
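The repository holds the real implementation of the date shifting. As a minimal sketch of one way it could be done, assuming dates are stored as YYYY-MM-DD strings and that the earliest stored date should land on today, something like the following would work. The helper name shiftDates is hypothetical and is not taken from the repository.

```go
package main

import (
	"fmt"
	"time"
)

// dateLayout is Go's reference layout for YYYY-MM-DD dates.
const dateLayout = "2006-01-02"

// shiftDates (hypothetical helper) moves every date by the same number of
// days so that the earliest date in the list lands on today.
func shiftDates(dates []string, today string) ([]string, error) {
	now, err := time.Parse(dateLayout, today)
	if err != nil {
		return nil, err
	}
	parsed := make([]time.Time, len(dates))
	var earliest time.Time
	for i, d := range dates {
		t, err := time.Parse(dateLayout, d)
		if err != nil {
			return nil, err
		}
		parsed[i] = t
		if i == 0 || t.Before(earliest) {
			earliest = t
		}
	}
	// Both values are midnight UTC, so the difference is a whole number of days.
	offset := int(now.Sub(earliest).Hours() / 24)
	shifted := make([]string, len(parsed))
	for i, t := range parsed {
		shifted[i] = t.AddDate(0, 0, offset).Format(dateLayout)
	}
	return shifted, nil
}

func main() {
	stored := []string{"2024-06-21", "2024-06-23", "2024-06-30"}
	shifted, err := shiftDates(stored, time.Now().Format(dateLayout))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i := range stored {
		fmt.Println(stored[i], "->", shifted[i])
	}
}
```

Applying one offset to every date keeps the gaps between programmes intact, so a ten-day schedule still spans ten days after the shift.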

The Go code for What’s On can be found here: https://github.com/Vextasy/WhatsOn.

Usage

./whatson [-d] [-v] statement of desire ...

The -d flag causes the program to follow the list of suggestions with a description of each programme. Claude can rewrite those descriptions when asked to, which can give some insight into why it has made a suggestion.

For example:

./whatson I would like to watch programmes about archaeology that are on in the next couple of weeks.

- Sun 2024-07-07 at 20:00: Digging for Britain on BBC4
- Sun 2024-07-14 at 20:00: Digging for Britain on BBC4

An explanation of Claude’s thinking can be viewed by providing the -v flag.

./whatson -v I would like to watch programmes about archaeology that are on in the next couple of weeks.

To provide appropriate TV programme suggestions about archaeology for the next couple of weeks, I'll need to use the get_tv_programmes function. Let's gather the necessary information:

1. The current date is 2024-06-30 (Sunday).
2. The user wants programmes about archaeology in the next couple of weeks.

I'll set the date range from today (2024-06-30) to two weeks later (2024-07-14) to cover the requested period.

Using the tool date range: 2024-06-30 to 2024-07-14
- Sun 2024-07-07 at 20:00: Digging for Britain on BBC4
- Sun 2024-07-14 at 20:00: Digging for Britain on BBC4

A description of each programme can be obtained with the -d flag.

./whatson "Show me suggestions for programmes that are related to animals and are on in the next couple of weeks."

- Sun 2024-07-02 at 18:00: Countryfile on BBC1
- Sun 2024-07-02 at 20:00: The Great British Bake Off on Channel 4
- Fri 2024-07-05 at 20:00: The Secret Life of the Zoo on BBC4
- Mon 2024-07-08 at 22:30: The Sky at Night on BBC4
- Sun 2024-07-14 at 20:00: Digging for Britain on BBC4

Asking Claude to explain itself through the descriptions can be insightful.

./whatson -d "Show me suggestions for programmes that are related to animals and are on in the next couple of weeks.  Update the programme description to show how it is related to animals."

- Sun 2024-07-02 at 18:00: Countryfile on BBC1
- Sun 2024-07-02 at 20:00: The Great British Bake Off on Channel 4
- Fri 2024-07-05 at 20:00: The Secret Life of the Zoo on BBC4
- Mon 2024-07-08 at 22:30: The Sky at Night on BBC4
- Sun 2024-07-14 at 20:00: Digging for Britain on BBC4

Descriptions:
- Countryfile : Exploring rural issues and celebrating the beauty of the British countryside, with segments featuring farm animals
- The Great British Bake Off : Amateur bakers compete in a series of challenging baking tasks using ingredients like eggs and dairy from animals.
- The Secret Life of the Zoo : Behind-the-scenes look at the lives of various animals and their keepers at Chester Zoo, providing intimate insights into animal behavior, care, and conservation efforts.
- The Sky at Night : While primarily about astronomy, this episode may explore the impact of space exploration on animal research, such as studying the effects of microgravity on various species or discussing animals used in early space missions.
- Digging for Britain : This archaeology program often uncovers animal remains and artifacts related to historical human-animal interactions, providing insights into ancient fauna and our evolving relationships with animals throughout British history.

Apologies to any vegetarians, but the explanation that Claude gives for including the Great British Bake Off in the list of programmes related to animals is a lovely example of how tricky it is to ensure that your aims and what the AI takes to be your aims are aligned.

Techniques

In addition to using the very nice interface that go-anthropic provides for interacting with Claude from Go, this application makes use of three techniques that are described in the Claude documentation:

  1. Embedding
  2. Tool Use
  3. XML Tags

Embedding

The Claude documentation describes embeddings as “numerical representations of text that enable measuring semantic similarity”. So, strictly, in this application we don’t do proper embedding. But I was keen to see how embeddings could be used to incorporate data from a local database into the interaction with an AI and the technique that is used here is similar to what would be done when performing proper embedding.

An embedding model converts each document in a local dataset into a vector, and a similarity search over those vectors returns the documents that most closely match a given query. These documents can then be passed to Claude as part of the prompt so that Claude can use the information they contain to help answer the original query.

Claude doesn’t have its own embedding model but, rather, works with an external embedding mechanism. Currently Anthropic recommends using Voyage AI.
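To make the idea of "most closely match" concrete, here is a small sketch of similarity search using cosine similarity, a common way of comparing embedding vectors. The two-dimensional vectors are made up for illustration; in a real application they would come from an embedding model, and the names doc, cosine and nearest are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// doc pairs a piece of text with its (here, invented) embedding vector.
type doc struct {
	Text   string
	Vector []float64
}

// cosine returns the cosine similarity of two vectors of equal length,
// or 0 if the lengths differ or either vector is all zeros.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest returns the k documents whose vectors are most similar to query.
func nearest(query []float64, docs []doc, k int) []doc {
	ranked := append([]doc(nil), docs...) // copy so the input order is untouched
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Vector) > cosine(query, ranked[j].Vector)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	docs := []doc{
		{Text: "Digging for Britain", Vector: []float64{0.9, 0.1}},
		{Text: "The Sky at Night", Vector: []float64{0.1, 0.9}},
	}
	query := []float64{1, 0} // stands in for the embedding of the user's request
	for _, d := range nearest(query, docs, 1) {
		fmt.Println(d.Text)
	}
}
```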

For this simple application, our local database is a collection of TV programme schedules. Each programme has a name, a channel, a date, a time and a description. We encourage Claude to ask for details of TV programmes that are scheduled to be broadcast between a given range of dates. We could equally well have asked Claude to request programmes based on a more complex query, such as by genre or subject matter, but then we would need to consider using a proper embedding database in order to provide a good response. By keeping the local database queries to just a range of dates, we can handle selection from the database with a very basic model.
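As a rough sketch of how little is needed for a date-range selection, the following loads programmes from XML in the format used throughout this notebook and keeps those inside an inclusive range. The type and function names are my own for illustration and may not match the repository.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Programme mirrors one <Programme> element of the TV database.
type Programme struct {
	Channel     string `xml:"Channel"`
	Name        string `xml:"Name"`
	Date        string `xml:"Date"` // YYYY-MM-DD
	Time        string `xml:"Time"` // HH:MM
	Description string `xml:"Description"`
}

// Programmes mirrors the outer <Programmes> element.
type Programmes struct {
	XMLName xml.Name    `xml:"Programmes"`
	Items   []Programme `xml:"Programme"`
}

// loadProgrammes (hypothetical helper) parses the XML database.
func loadProgrammes(data string) ([]Programme, error) {
	var ps Programmes
	if err := xml.Unmarshal([]byte(data), &ps); err != nil {
		return nil, err
	}
	return ps.Items, nil
}

// inRange keeps programmes dated between from and to inclusive.
// An empty to is treated as the same day as from.
func inRange(all []Programme, from, to string) []Programme {
	if to == "" {
		to = from
	}
	var out []Programme
	for _, p := range all {
		// YYYY-MM-DD strings sort in date order, so string comparison is enough.
		if p.Date >= from && p.Date <= to {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	db := `<Programmes>
  <Programme><Channel>BBC4</Channel><Name>Digging for Britain</Name><Date>2024-07-07</Date><Time>20:00</Time><Description>Archaeology.</Description></Programme>
  <Programme><Channel>BBC1</Channel><Name>Countryfile</Name><Date>2024-07-21</Date><Time>18:00</Time><Description>Rural issues.</Description></Programme>
</Programmes>`
	all, err := loadProgrammes(db)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range inRange(all, "2024-06-30", "2024-07-14") {
		fmt.Println(p.Date, p.Time, p.Name, "on", p.Channel)
	}
}
```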

Tool Use

Claude allows user-defined tools to be described as part of a request. A tool can be thought of as a function that Claude might call to request certain information. A tool definition gives the tool a name and a description that tells Claude what the tool can be used for and how to use it, and it also enumerates any parameters that Claude should supply when calling the tool.

If Claude decides that it could benefit from using a particular tool, it responds to the initial request with a request to call the tool and supplies the necessary tool parameters. In go-anthropic, a request to make use of a tool appears in a response as a content item with content.Type == anthropic.MessagesContentTypeToolUse, and the parameters can be obtained from the content.MessageContentToolUse field.

In an application making real use of an embedding database we might ask Claude to supply a query that could be made to the database to return a set of documents with which to enhance the subsequent request to Claude. In our rather simple example the tool use request will contain the date range between which Claude would like to know about programmes from our TV database.
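The date range arrives as the tool's input parameters. As a sketch only, assuming that input is available as a JSON object with the from_date and to_date keys from the tool definition, it could be decoded and checked with the standard library like this. The names dateRange and parseDateRange are hypothetical, and the go-anthropic plumbing around it is deliberately left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// dateRange holds the parameters Claude supplies to get_tv_programmes.
type dateRange struct {
	FromDate string `json:"from_date"`
	ToDate   string `json:"to_date"`
}

// parseDateRange (hypothetical helper) decodes the tool input, applies the
// "to_date defaults to from_date" rule and checks that both dates are valid.
func parseDateRange(raw string) (dateRange, error) {
	var r dateRange
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return r, err
	}
	if r.FromDate == "" {
		return r, errors.New("from_date is required")
	}
	if r.ToDate == "" {
		r.ToDate = r.FromDate
	}
	from, err := time.Parse("2006-01-02", r.FromDate)
	if err != nil {
		return r, err
	}
	to, err := time.Parse("2006-01-02", r.ToDate)
	if err != nil {
		return r, err
	}
	if to.Before(from) {
		return r, errors.New("to_date is before from_date")
	}
	return r, nil
}

func main() {
	r, err := parseDateRange(`{"from_date": "2024-06-30", "to_date": "2024-07-14"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("Using the tool date range: %s to %s\n", r.FromDate, r.ToDate)
}
```

Validating the dates before touching the database is worthwhile because the values are generated by the model and are not guaranteed to follow the requested format.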

The tools available to Claude are described in the Tools section of the initial message request. Our get_tv_programmes tool is described like this:

Tools: []anthropic.ToolDefinition{
    {
        Name: "get_tv_programmes",
        Description: `Gets information about the TV programmes that will be aired between the given from_date and to_date inclusive.
            If to_date is not provided, it will default to the same date as the from_date.
            It should be used when the user asks about TV programmes and we want to know which programs will be shown in the near future including today.
            Programme information will be returned in XML tags with the following format:
                <Programmes>
                    <Programme>
                        <Channel>Channel Name</Channel>
                        <Name>Programme Name</Name>
                        <Date>YYYY-MM-DD</Date>
                        <Time>HH:MM</Time>
                        <Description>Programme Description</Description>
                    </Programme>
                </Programmes>
            The outer level <Programmes></Programmes> tag will be omitted if the result is empty,
            but will otherwise contain one or more <Programme></Programme> tags.
            The <Channel></Channel> tag contains the TV channel on which the program will be shown.
            `,
        InputSchema: jsonschema.Definition{
            Type: jsonschema.Object,
            Properties: map[string]jsonschema.Definition{
                "from_date": {
                    Type:        jsonschema.String,
                    Description: "The start date of the TV programmes search. Format: YYYY-MM-DD",
                },
                "to_date": {
                    Type:        jsonschema.String,
                    Description: "The end date of the TV programmes search. Format: YYYY-MM-DD",
                },
            },
            Required: []string{"from_date"},
        },
    }},

It makes it quite clear which parameters will be required and the format in which information will be returned from the tool.
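The other half of the contract is producing a tool result in exactly the format the description promises. A minimal sketch using encoding/xml follows, including the rule that the outer tag is omitted when there is nothing to return. The names Programme, Programmes and renderProgrammes are illustrative and are not taken from the repository.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Programme mirrors one <Programme> element in the tool result.
type Programme struct {
	Channel     string `xml:"Channel"`
	Name        string `xml:"Name"`
	Date        string `xml:"Date"`
	Time        string `xml:"Time"`
	Description string `xml:"Description"`
}

// Programmes is the outer <Programmes> wrapper.
type Programmes struct {
	XMLName xml.Name    `xml:"Programmes"`
	Items   []Programme `xml:"Programme"`
}

// renderProgrammes (hypothetical helper) builds the tool result text.
// An empty selection yields an empty string, so the outer tag is omitted.
func renderProgrammes(items []Programme) (string, error) {
	if len(items) == 0 {
		return "", nil
	}
	// xml.Marshal escapes characters such as & and < in the field values.
	out, err := xml.Marshal(Programmes{Items: items})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	text, err := renderProgrammes([]Programme{{
		Channel:     "BBC4",
		Name:        "Digging for Britain",
		Date:        "2024-07-07",
		Time:        "20:00",
		Description: "Archaeology",
	}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```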

XML Tags

Anthropic recommends using XML tags to structure the data supplied to Claude and the data returned from it. This is achieved by being explicit about the format of such input or output data in the prompt that is constructed to send to Claude.

In our case we construct the following initial prompt:

`You are given information about TV programmes enclosed between <Programmes></Programmes> tags.
The date today is ` + time.Now().Format("2006-01-02") + `, a ` + time.Now().Weekday().String() + `.
You are an expert at suggesting TV programmes.
The user would like to have some suggestions about what TV programmes to watch but has the following strict request:<Request>` + desire + `</Request>.
Your task is to provide a list of programme suggestions in XML format using the following tag structure for each selection:
<Suggestion><Channel>Channel Name</Channel><Name>Programme Name</Name><Date>YYYY-MM-DD</Date><Time>HH:MM</Time><Description>Programme Description</Description></Suggestion>
Wrap the list of <Suggestion></Suggestion> tags in a <Suggestions></Suggestions> tag.
Do not output any other information either before or after the <Suggestions></Suggestions> tags.
If the user does not request that programs begin within a particular date range then use a range that begins today and ends 1 week later.
Return at most 10 suggestions.
If you have no suggestions then return the <Suggestions></Suggestions> tags with no content.
Before answering re-check that all of the user's requests have been met.
If you are not sure that a particular suggestion satisfies the user's Requests then do not return that suggestion.
If the user has requested programmes on a particular day of the week or days of the week check that only programmes on those days are returned.
`

The prompt explains to Claude that it can find programme schedule information between <Programmes> and </Programmes> tags and that it should return information between <Suggestions> and </Suggestions> tags. This makes it easy for Claude to read the results of its use of a tool and for us to read the output of the final interaction with Claude.
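Reading the final reply then becomes an ordinary XML parsing job. The sketch below assumes the reply text is available as a string, extracts the <Suggestions> element even if stray text surrounds it, and prints each suggestion in the style of the output shown under Usage. The names parseSuggestions and formatLine are hypothetical, and the weekday is derived from the date, which may differ from what the real program does.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Suggestion mirrors one <Suggestion> element in Claude's reply.
type Suggestion struct {
	Channel     string `xml:"Channel"`
	Name        string `xml:"Name"`
	Date        string `xml:"Date"`
	Time        string `xml:"Time"`
	Description string `xml:"Description"`
}

// Suggestions is the outer <Suggestions> wrapper.
type Suggestions struct {
	XMLName xml.Name     `xml:"Suggestions"`
	Items   []Suggestion `xml:"Suggestion"`
}

// parseSuggestions (hypothetical helper) pulls the <Suggestions> element
// out of the reply text and decodes it.
func parseSuggestions(reply string) ([]Suggestion, error) {
	const openTag, closeTag = "<Suggestions>", "</Suggestions>"
	start := strings.Index(reply, openTag)
	end := strings.LastIndex(reply, closeTag)
	if start < 0 || end < start {
		return nil, errors.New("no <Suggestions> element in reply")
	}
	var s Suggestions
	if err := xml.Unmarshal([]byte(reply[start:end+len(closeTag)]), &s); err != nil {
		return nil, err
	}
	return s.Items, nil
}

// formatLine renders one suggestion as a list item. If the date cannot be
// parsed it is printed as received, without a weekday.
func formatLine(s Suggestion) string {
	day := s.Date
	if t, err := time.Parse("2006-01-02", s.Date); err == nil {
		day = t.Format("Mon 2006-01-02")
	}
	return fmt.Sprintf("- %s at %s: %s on %s", day, s.Time, s.Name, s.Channel)
}

func main() {
	reply := `<Suggestions><Suggestion><Channel>BBC4</Channel><Name>Digging for Britain</Name><Date>2024-07-07</Date><Time>20:00</Time><Description>Archaeology.</Description></Suggestion></Suggestions>`
	items, err := parseSuggestions(reply)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range items {
		fmt.Println(formatLine(s))
	}
}
```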

Constructing the TV Database

In order to construct the database of TV schedules we made use of Claude to generate some fictitious TV programmes and to assign them a schedule. The prompt needed to perform that task was:

I would like you to generate some fictitious TV listings data for me.

The data will be in XML format using the following tags:

<Programmes>
    <Programme>
        <Channel>Channel Name</Channel>
        <Name>Programme Name</Name>
        <Date>YYYY-MM-DD</Date>
        <Time>HH:MM</Time>
        <Description>Programme Description</Description>
    </Programme>
</Programmes>

Each programme is described by the contents of a pair of <Programme></Programme> tags.

Valid values for Channel are: BBC1, BBC2, BBC3, BBC4, ITV, Channel 4.

Here are some examples:

<Programmes>
    <Programme>
        <Channel>BBC1</Channel>
        <Name>The Night of the Hunter</Name>
        <Date>2024-06-20</Date>
        <Time>18:00</Time>
        <Description>A film about the life of a hunter and his family who travel from location to location in search of food.</Description>
    </Programme>
    <Programme>
        <Channel>BBC2</Channel>
        <Name>Would I Lie to You?</Name>
        <Date>2024-06-20</Date>
        <Time>19:00</Time>
        <Description>A comedy quiz in which contestants have to determine if the others are lying or telling the truth.</Description>
    </Programme>
    <Programme>
        <Channel>BBC4</Channel>
        <Name>Easter Island Origins</Name>
        <Date>2024-06-20</Date>
        <Time>19:00</Time>
        <Description>Documentary exploring how new evidence is challenging everything we thought we knew about Easter Island and the nearly 900 giant stone statues scattered across this remote Pacific island. They are some of the most famous and mysterious monuments on the planet. Nearly 900 giant stone heads scattered across a remote island in the middle of the Pacific. Now, brand new evidence is challenging everything we thought we knew about Easter Island’s awe-inspiring statues – and those who made them. Drawing on the latest science, this authoritative documentary radically rewrites the story of Easter Island.</Description>
    </Programme>
    <Programme>
        <Channel>BBC2</Channel>
        <Name>Elsa the Lioness</Name>
        <Date>2024-06-21</Date>
        <Time>18:00</Time>
        <Description>First transmitted in 1961, David Attenborough visits Joy and George Adamson in Kenya and meets Elsa the lioness and her cubs shortly before Elsa's death. First transmitted in 1961, David Attenborough travels to Meru National Park in Kenya to visit Joy and George Adamson and meet Elsa the lioness and her cubs shortly before Elsa's death.</Description>
    </Programme>
</Programmes>

I would like you to generate data for about 70 programmes with dates ranging from 2024-06-21 to 2024-06-30
and with starting times on the hour or at 15, 30 or 45 minutes past the hours.
Make the programme names and descriptions as realistic as possible.
The genres of some of the programmes should be film.

Such a simple thing saved a lot of programming time. As often seems to be the case when working with Claude, I was surprised at some of the detail in the fictitious schedule. The full database is in the file TvDb.xml in the GitHub repository but, for example, the programme “Good Morning Britain” has been scheduled for 06:30 and “The Sky at Night” for 22:30.

Final Thoughts

Even though this is a very simple application, it made it clearer to me how an AI such as Claude could be programmed to make use of specific locally stored information and combine it with the language, and wider knowledge, that the AI has, resulting in a useful tool.

Being able to precisely describe the expected inputs and outputs makes it easier to work with Claude from a traditional programming language.

AIs, such as Claude, can often be used to help construct parts of the program. In our case it built our TV database, but it could also have been used to write some of the code used to interact with the SDK.

It is worth stressing in the prompts the importance of checking the results before returning them. The -v flag, which gets Claude to explain its workings, was also a useful addition.