Introducing the New Anthropic Token Counting API

Anthropic has released some exciting beta features in the last couple of days that have largely gone under the radar. One of these is the ability to process PDFs with their models, which can now understand both text and visual content within PDF documents. I may write that up at a later date.

The other exciting beta feature, and the subject of this article, was the introduction of token counting. Crucially, you can count the tokens in user messages, PDFs and images before you send them to Claude. This is excellent news for those who like to monitor their token usage costs closely.

According to the official announcement from Anthropic:

"The token counting endpoint accepts the same structured list of inputs for creating a message, including support for system prompts, tools, images, and PDFs. The response contains the total number of input tokens."

The announcement lists the following supported models:

Claude 3.5 Sonnet
Claude 3.5 Haiku
Claude 3 Haiku
Claude 3 Opus

The good news is that token counting is free to use but subject to requests per minute rate limits based on your usage tier.
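Because the endpoint is rate-limited, a script that counts tokens for many inputs in a loop may occasionally be told to slow down. Below is a minimal sketch of a generic retry helper with exponential backoff. The helper name call_with_backoff and its parameters are my own and not part of the Anthropic SDK; the exception type to retry on is passed in by the caller.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); if it raises retry_on, wait and try again with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts, surface the original error
            # Wait roughly 1s, 2s, 4s, ... plus a little jitter so callers spread out
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            sleep(delay)
```

With the SDK, I would expect to pass anthropic.RateLimitError as retry_on and wrap the count_tokens call in a lambda. The SDK client also retries some failures on its own, so treat this as an optional extra layer.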

For the rest of this article, we’ll go through some examples of using the token counting API to count tokens in user/system messages, PDFs and images.

To make things more interactive, once we have the basics of our code developed, we’ll wrap up the functionality in a Gradio app that will display a nice user interface to enter user text or upload PDFs and images, then count the tokens. It’ll look a bit like this,

Ok, let’s get started. First off, I’m developing using Windows WSL2 Ubuntu. If you’re a Windows user, I have a comprehensive guide on installing WSL2, which you can find here.

Setting up a dev environment

Before we start coding, let’s set up a separate development environment. That way, all our projects will be siloed and won’t interfere with each other. I use conda for this, but use whichever tool you’re familiar with.

(base) $ conda create -n token_count python=3.10 -y
(base) $ conda activate token_count
# Install the required libraries (gradio is needed for the app later on)
(token_count) $ pip install anthropic jupyter gradio

Getting an Anthropic API key

You can get one from the Anthropic Console. Register or sign in, then you’ll see a screen like this,

Click the Get API Keys button and follow the instructions from there. Take note of your key and set the environment variable ANTHROPIC_API_KEY to it.
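The SDK client picks up ANTHROPIC_API_KEY from the environment by default, so a missing variable only shows up as an error when the client is used. A small pre-flight check like the sketch below can give a friendlier message. The function name require_api_key is hypothetical, and the env parameter exists only so the check can be exercised without touching the real environment.

```python
import os


def require_api_key(env=os.environ, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or raise a readable error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return key
```

Calling require_api_key() at the top of a notebook or script fails fast if the variable was never exported in the current shell.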

The code

Example 1 – Counting tokens in the user and system prompts.

import anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

response = client.beta.messages.count_tokens(
    betas=["token-counting-2024-11-01"],
    model="claude-3-5-sonnet-20241022",
    system="""
        You are a helpful assistant and will respond to users's queries 
        in a polite, friendly and knowledgable manner
    """,
    messages=[{
        "role": "user",
        "content": "What is the capital city of France"
    }],
)

print(response.json())

#
# Output
#

{"input_tokens":41}
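The announcement quoted earlier says tool definitions are counted too. Here is a minimal sketch of how the request arguments might be assembled when a tool is included. The helper build_count_request and the get_weather tool are made up for illustration; only the overall shape (name, description, input_schema) follows the tool format used when creating a message.

```python
def build_count_request(user_text, system=None, tools=None,
                        model="claude-3-5-sonnet-20241022"):
    """Assemble keyword arguments for client.beta.messages.count_tokens()."""
    kwargs = {
        "betas": ["token-counting-2024-11-01"],
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    # Only include optional fields when they were actually provided
    if system:
        kwargs["system"] = system
    if tools:
        kwargs["tools"] = tools
    return kwargs


# Hypothetical tool definition, for illustration only
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

request = build_count_request(
    "What is the weather in Paris?",
    tools=[get_weather_tool],
)
```

The resulting dictionary can then be unpacked into the call, as in client.beta.messages.count_tokens(**request). I would expect the count to be higher than for the same prompt without the tool, since the schema is part of the input.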

Example 2 – Counting tokens in a PDF

For my input PDF, I’ll use a copy of Tesla’s September 2023 Form 10-Q quarterly filing with the Securities and Exchange Commission. This document is 51 pages of mixed text and tabular data. You can see what it looks like online by clicking here.

import base64
import anthropic

client = anthropic.Anthropic()

with open("/mnt/d/tesla/tesla_q10_sept_23.pdf", "rb") as pdf_file:
    pdf_base64 = base64.standard_b64encode(pdf_file.read()).decode("utf-8")

response = client.beta.messages.count_tokens(
    betas=["token-counting-2024-11-01", "pdfs-2024-09-25"],
    model="claude-3-5-sonnet-20241022",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_base64
                }
            },
            {
                "type": "text",
                "text": "Please summarize this document."
            }
        ]
    }]
)

print(response.json())

#
# Output
#

{"input_tokens":118967}
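A count like this is most useful when you turn it into something actionable. The sketch below estimates the input cost and how much room is left for a response. Both constants are assumptions on my part: a 200,000-token context window for Claude 3.5 Sonnet and an input price of $3 per million tokens. Check the current pricing and model pages before relying on either.

```python
CONTEXT_WINDOW = 200_000        # assumed context window for Claude 3.5 Sonnet, in tokens
ASSUMED_USD_PER_MILLION = 3.00  # assumed input price; check the current pricing page


def estimate_input_cost(input_tokens, usd_per_million=ASSUMED_USD_PER_MILLION):
    """Rough input-side cost in US dollars for a given token count."""
    return input_tokens / 1_000_000 * usd_per_million


def remaining_context(input_tokens, context_window=CONTEXT_WINDOW):
    """Tokens left in the context window after the input has been accounted for."""
    return context_window - input_tokens


pdf_tokens = 118_967  # the count returned for the PDF example
print(f"Estimated input cost: ${estimate_input_cost(pdf_tokens):.4f}")
print(f"Tokens left in the context window: {remaining_context(pdf_tokens)}")
```

Under those assumptions, the PDF alone uses more than half of the context window, which is worth knowing before sending it with a long conversation history.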

Example 3 – Counting tokens in an image

This is the image I’ll use.

It’s a PNG and approximately 2.6MB in size.

import anthropic
import base64

image_path = "/mnt/d/images/android.png"
image_media_type = "image/png"
# Read the image file and encode it to base64
with open(image_path, "rb") as image_file:
    image_data = base64.standard_b64encode(image_file.read()).decode("utf-8")

client = anthropic.Anthropic()

# Create the request using the locally stored image
response = client.beta.messages.count_tokens(
    betas=["token-counting-2024-11-01"],
    model="claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image_media_type,
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image"
                }
            ],
        }
    ],
)

print(response.json())

#
# Output
#

{"input_tokens":1575}
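If you want a ballpark figure without calling the API, Anthropic's vision documentation gives a rule of thumb of roughly (width x height) / 750 tokens, with large images scaled down first. The sketch below assumes that rule still holds and that scaling kicks in when the long edge exceeds 1568 pixels. It is an approximation only, and the token counting endpoint remains the authoritative number.

```python
def estimate_image_tokens(width_px, height_px, max_long_edge=1568):
    """Rough token estimate for an image, based on its pixel dimensions."""
    long_edge = max(width_px, height_px)
    if long_edge > max_long_edge:
        # Assume the image is scaled down proportionally to fit the long-edge limit
        scale = max_long_edge / long_edge
        width_px *= scale
        height_px *= scale
    return round(width_px * height_px / 750)


print(estimate_image_tokens(1092, 1092))
```

The file size on disk does not enter into this estimate at all. Only the pixel dimensions matter, which is why a multi-megabyte PNG can still come out at a modest token count.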

Note that in all the above examples, no requests were sent to the LLM to answer the user's questions. Only the tokens were counted.

Pulling it all together into a Gradio app.

Now that we have all the code we need, let’s design a user interface for it using Gradio.

We need two input text boxes, one for an optional system prompt and one for an optional user prompt.

Next, we’ll need an input field where the user can select PDF or image files to upload. Below this field, there will be an Add Files button to allow the user to add the files chosen above. The names of any chosen files or images will be displayed in a message box.

Finally, there will be a button that calls the code to calculate the token cost and a button to clear all input and output fields.

We can do this part using an LLM. It took a bit of back and forth, but eventually, with GPT-4o’s help, I developed this code. It’s heavily commented, so it should be relatively easy to follow.

# Import Gradio for building the web app interface
import gradio as gr
# Import the Anthropic client for the token counting API
import anthropic
# Import base64 for encoding files in base64 format
import base64

# Initialize the Anthropic client to access the API functions
# need to have your ANTHROPIC_API_KEY environment variable set
client = anthropic.Anthropic()

# Define a function to handle file uploads incrementally, allowing files to be added without overwriting previous uploads
def add_files(uploaded_files, current_files):
    # Initialize the current_files list if it's empty
    if current_files is None:
        current_files = []

    # Append any newly uploaded files to the current list of files
    if uploaded_files:
        current_files.extend(uploaded_files)

    # Create a list of file names for display purposes
    file_names = [file.name for file in current_files]

    # Return the updated file list, the display names, and clear the uploaded_files input
    return current_files, file_names, None

# Define a function to count tokens in system and user prompts, as well as in uploaded files
def count_tokens(system_prompt, user_prompt, all_files):
    # Check if all inputs are empty or cleared; if so, return 0
    if not system_prompt and not user_prompt and not all_files:
        return 0

    # Initialize an empty list to store the message objects for the API request
    messages = []

    # Add the user prompt to the messages list if it's provided
    if user_prompt:
        messages.append({
            "role": "user",
            "content": user_prompt
        })

    # Process each uploaded file, determining whether it's a PDF or an image
    if all_files:
        for file in all_files:
            # Get the file type by extracting and converting the file extension to lowercase
            file_type = file.name.split(".")[-1].lower()

            # If the file is a PDF, encode it in base64 and prepare a document message
            if file_type == "pdf":
                with open(file.name, "rb") as f:
                    pdf_base64 = base64.standard_b64encode(f.read()).decode("utf-8")
                pdf_content = {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_base64
                    }
                }
                # Add the PDF message to the messages list with a prompt for summarization
                messages.append({
                    "role": "user",
                    "content": [pdf_content, {"type": "text", "text": "Please summarize this document."}]
                })

            # If the file is an image (JPEG or PNG), encode it in base64 and prepare an image message
            elif file_type in ["jpg", "jpeg", "png"]:
                # The API expects "image/jpeg" for both .jpg and .jpeg files
                media_type = "image/png" if file_type == "png" else "image/jpeg"
                with open(file.name, "rb") as f:
                    image_base64 = base64.standard_b64encode(f.read()).decode("utf-8")
                image_content = {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": media_type,
                        "data": image_base64,
                    }
                }
                # Add the image message to the messages list with a prompt to describe it
                messages.append({
                    "role": "user",
                    "content": [image_content, {"type": "text", "text": "Describe this image"}]
                })

    # If no prompts or files are provided, add a placeholder message
    if not messages:
        messages.append({
            "role": "user",
            "content": ""
        })

    # Call the Anthropic API to count tokens, using system prompt and messages as input
    response = client.beta.messages.count_tokens(
        betas=["token-counting-2024-11-01", "pdfs-2024-09-25"],
        model="claude-3-5-sonnet-20241022",
        system=system_prompt,
        messages=messages,
    )

    # Return the total number of tokens counted
    return response.input_tokens

# Define a function to clear all input fields in the Gradio app
def clear_inputs():
    return "", "", [], "", ""

# Build the Gradio interface
with gr.Blocks(theme="huggingface") as app:
    # Display a title for the app
    gr.Markdown("<h1 style='text-align: center;'>Anthropic Token Counter</h1>")

    # Create input fields for system and user prompts
    with gr.Row():
        system_prompt = gr.Textbox(label="System Prompt", placeholder="Enter the system prompt here...", lines=3)
        user_prompt = gr.Textbox(label="User Prompt", placeholder="Enter the user prompt here...", lines=3)

    # Create an upload field for multiple PDF or image files
    uploaded_files = gr.File(label="Upload PDF(s) or Image(s)", file_count="multiple", file_types=[".pdf", ".jpg", ".jpeg", ".png"])

    # Create a state variable to hold the list of currently uploaded files
    current_files = gr.State([])

    # Display a text box to show the names of uploaded files
    file_display = gr.Textbox(label="Uploaded Files", interactive=False) 

    # Define buttons for adding files, counting tokens, and clearing inputs
    add_files_button = gr.Button("Add Files")
    with gr.Row():
        count_button = gr.Button("Count Tokens", size="sm")
        clear_button = gr.Button("Clear", size="sm")

    # Display the token count result in a text box
    result = gr.Textbox(label="Token Count", interactive=False)

    # Configure the "Add Files" button to append files to the current file list
    add_files_button.click(fn=add_files, inputs=[uploaded_files, current_files], outputs=[current_files, file_display, uploaded_files])

    # Configure the "Count Tokens" button to process the prompts and files, displaying the token count
    count_button.click(fn=count_tokens, inputs=[system_prompt, user_prompt, current_files], outputs=result)

    # Configure the "Clear" button to reset all inputs and the token count display
    clear_button.click(fn=clear_inputs, outputs=[system_prompt, user_prompt, current_files, file_display, result])

# Launch the Gradio app
app.launch()
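The PDF and image branches in count_tokens do almost the same work, so they could be folded into one helper. The sketch below is one way to do that. The function name file_to_content_block and the MEDIA_TYPES table are my own, and I'm assuming only the three file types the app accepts. Keeping the mapping explicit also makes it clear that both .jpg and .jpeg files are sent as image/jpeg.

```python
import base64
from pathlib import Path

# File extension -> media type sent to the API
MEDIA_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def file_to_content_block(path):
    """Read a PDF or image from disk and return a base64 content block."""
    path = Path(path)
    media_type = MEDIA_TYPES.get(path.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported file type: {path.suffix!r}")
    data = base64.standard_b64encode(path.read_bytes()).decode("utf-8")
    # PDFs go in a "document" block, everything else here is an "image" block
    block_type = "document" if media_type == "application/pdf" else "image"
    return {
        "type": block_type,
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```

Inside the loop, each file would then become a single message whose content is the returned block followed by the text prompt, and unsupported extensions raise an error instead of being silently skipped.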

To use the app, do the following.

Enter a system and/or user prompt if required. You can leave these blank if you want.
To upload one or more files, drag a file into the file upload box or click on it and choose a file. After this, click the Add Files button, and your chosen file should appear in the Uploaded Files box.
Repeat the step above to add more files if you want.
Click the Count Tokens button to display a count of the tokens in all uploaded files and/or any text entered into the user or system prompts.
Click the Clear button to reset everything and start from scratch.

Here’s an example run where I uploaded 2 PDF files and an image along with a user prompt.

Summary

In this article, I wrote about an announcement made by Anthropic about a new token-counting API that had been released in beta. I then went on to use the API to develop code that counts tokens for user and system prompts, as well as for uploaded images and PDF documents.

I then showed how you would develop a user interface for the code using Gradio, bundling the code we developed into the app.

Finally, I showed what the app looks like and provided a working example of its use.

_Ok, that’s all for me for now. Hopefully, you found this article useful. If you did, please check out my profile page at this link. From there, you can see my other published stories and subscribe to get notified when I post new content._
