We support every LLM, here's how we did it
Rosetta: Helicone's Solution to Universal LLM Support
The Problem: Fragmented LLM Ecosystem
In the AI industry, the explosion of large language models (LLMs) has led to a fragmented ecosystem. Different LLMs operate on distinct schemas and protocols, posing integration challenges for businesses seeking a unified AI observability solution. This fragmentation hinders seamless monitoring, analytics, and scalability.
Introducing Rosetta: Helicone's Solution
Helicone’s response to this challenge is Rosetta, an LLM-powered mapper generator. The mappers it produces translate the diverse outputs and functionalities of various LLMs into a consistent schema that integrates seamlessly with Helicone's platform.
Core Components
Constructing this solution requires five key components:
Unified and Executable Mapper Format
Mapper Identification
Intelligent Mapper Versioning
Scalability & Speed
Executable Code Security
Unified and Executable Mapper Format
Create a standard, versatile format to translate and execute mappings across diverse LLM data structures.
Sandbox Environment: Offers high security and isolation but is complex to implement.
JFilters: Simplifies processing but may struggle with complex JSON structures.
Pure JavaScript with Eval: Provides flexibility and adaptability; chosen for its ability to handle diverse data formats despite potential security concerns, mitigated by a strict approval process.
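To make the chosen approach concrete, here is a minimal sketch of what a pure-JavaScript mapper executed via eval might look like. The mapper source, field names, and the sample OpenAI-style body are all illustrative assumptions, not Helicone's actual mapper format.

```javascript
// Illustrative sketch: a mapper stored as a JavaScript function expression,
// executed with eval to translate a provider-specific response body into a
// hypothetical unified schema. All names here are assumptions.
const mapperSource = `
  (function (body) {
    return {
      model: body.model,
      content: body.choices?.[0]?.message?.content ?? "",
      promptTokens: body.usage?.prompt_tokens ?? 0,
      completionTokens: body.usage?.completion_tokens ?? 0,
    };
  })
`;

// eval of a parenthesized function expression yields the function itself.
const mapChatCompletion = eval(mapperSource);

const raw = {
  model: "gpt-4",
  choices: [{ message: { content: "Hello!" } }],
  usage: { prompt_tokens: 10, completion_tokens: 3 },
};

const unified = mapChatCompletion(raw);
```

Because the mapper is just data (a string) until it is evaluated, new mappers can be generated and shipped without redeploying the platform — which is exactly why the eval approach demands the approval process described later.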
Mapper Identification
For identifying the appropriate mapper, we considered two main approaches:
Key-Based Identification: This straightforward method involves assigning unique keys, such as '/chat/completions', to each mapper.
Similarity Comparisons on JSON Bodies: A more advanced method that involves analyzing the JSON bodies for similarities to determine the appropriate mapper.
Ultimately, the decision to use key-based identification was driven by the need for a simple, efficient, and reliable system that can be easily managed and scaled as we grow.
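A key-based registry can be sketched in a few lines. The registry contents and mapper bodies below are illustrative assumptions; the only detail taken from the text is that keys look like request paths such as '/chat/completions'.

```javascript
// Illustrative sketch of key-based mapper identification: each mapper is
// registered under the request path it handles. The mappers shown are
// placeholders, not real Helicone mappers.
const mapperRegistry = new Map([
  ["/chat/completions", (body) => ({ content: body.choices?.[0]?.message?.content })],
  ["/completions", (body) => ({ content: body.choices?.[0]?.text })],
]);

function getMapper(path) {
  const mapper = mapperRegistry.get(path);
  if (!mapper) throw new Error(`No mapper registered for ${path}`);
  return mapper;
}
```

Lookup is a constant-time map access, which is what makes this approach easy to manage and scale compared with similarity comparisons over JSON bodies.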
Intelligent Versioning
Implement a robust versioning system to manage updates in mappers, accommodating new fields in either the output or input schema.
Key Extraction: Starts with extracting keys and JSON structure from the input.
Key Analysis: Identifies unmapped or 'ignored' keys (present in input but not in output schema).
Mapper Update: Updates existing mapper when new keys are detected, adding or ignoring them as needed.
Schema Monitoring: Continuously checks for changes in output schema through hash comparisons, triggering new mapper versions as required.
Scalability & Speed
Because LLM schemas evolve so rapidly, Helicone maps LLM outputs on read, which makes both scalability and speed crucial.
The architecture must be scalable and capable of handling varying loads and complexities efficiently. To support this, it is essential not to call an API to perform the mapping; instead, the mapper should live in code (or be executed as code) and be retrieved from a cache system.
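The cache-backed fast path can be sketched like this. `fetchMapperSource` is an assumed stand-in for a slow lookup (e.g. a database or config store); the key and mapper source are illustrative.

```javascript
// Illustrative sketch: to keep the read path fast, mappers are executed as
// code and retrieved from an in-memory cache rather than fetched via an API
// call on every request.
const mapperCache = new Map();

function getCachedMapper(key, fetchMapperSource) {
  if (mapperCache.has(key)) return mapperCache.get(key); // fast path
  const source = fetchMapperSource(key); // slow path, taken once per key
  const mapper = eval(source);           // source is a function expression
  mapperCache.set(key, mapper);
  return mapper;
}
```

After the first request for a given key, every subsequent mapping is a map lookup plus a function call — no network round trip on the read path.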
Executable Code Security
In Rosetta, each new mapper faces an approval process, which is crucial for two reasons. First, it addresses the inherent security risks associated with using eval to execute mappers. Second, it verifies the mapper's functionality, ensuring it operates as intended. If a mapper fails or encounters issues, our system automatically reverts to a previous, stable version, providing a reliable fallback mechanism.
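The fallback mechanism can be sketched as trying mapper versions in order, newest first. How versions are actually stored and ordered is an assumption here; only the revert-on-failure behavior comes from the text.

```javascript
// Illustrative sketch: versions is an array of mapper functions ordered
// newest-first; a failing mapper falls back to the previous stable version.
function mapWithFallback(versions, body) {
  for (const mapper of versions) {
    try {
      return mapper(body);
    } catch (err) {
      // This version failed; fall through to the next (older) version.
    }
  }
  throw new Error("All mapper versions failed");
}
```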
Rosetta Mapper Lifecycle
Conclusion
Rosetta is Helicone's answer to the fast-changing world of large language models - a smart, secure bridge across the fragmented LLM landscape. So, what do you think - could Rosetta, as a standalone product, be the missing piece in your AI strategy? We'd love to hear if you're as excited about its potential as we are.