At MeetKai we believe that tool use, and the higher-order orchestration built around it, is the most important area to focus on for real-world applications of LLMs. With the release of Functionary 2.4, we have improved overall function-calling accuracy and introduced early support for code-interpreter-style functions. Our OpenAI-API-compatible, MIT-licensed small and medium variant models can be accessed here:
Small Variant (functionary-small-v2.4)
Medium Variant (functionary-medium-v2.4)
Functionary 2.4 continues to be the most powerful and fully featured open-source tool-use model, released under the permissive MIT license. It is the only open model offering 1:1 compatibility with all OpenAI capabilities.
SGD (Schema-Guided Dialogue) is a primary benchmark dataset for us, as it resembles the real-world customer applications of our LLMs. A key requirement is the model's ability to prompt users for information that is missing but required for a valid function call. For example, if someone asks, "What's the weather?" and a weather tool with a "city" argument exists, the model should ask which city the user wants weather information for. This is crucial to prevent the model from hallucinating the argument or passing in an empty string.
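To make the required-argument behavior concrete, here is a minimal sketch of what such a tool definition looks like in the OpenAI tools format (the tool name `get_weather` and its description are illustrative, not from Functionary itself). Because `city` is listed under `required` and the user's message does not supply it, a model that handles missing arguments correctly should reply with a clarifying question rather than emit a tool call:

```python
# Hypothetical weather tool in the OpenAI tools format.
# "city" is declared required, so a well-behaved model should ask
# "Which city?" instead of calling the tool with a hallucinated
# or empty argument.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "Name of the city, e.g. 'Paris'",
                },
            },
            "required": ["city"],
        },
    },
}

# The user's request omits the required argument.
messages = [{"role": "user", "content": "What's the weather?"}]
```

With an OpenAI-compatible client pointed at a Functionary server, this payload would be passed as the `tools` and `messages` parameters of a chat-completion request in the usual way.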
Functionary 2.3 had a significant gap in handling tool use for data-analysis queries, which rely on a code interpreter tool that executes code and uses the result to generate a response. In 2.3, the model could correctly predict the tool, but the generated code was often invalid or incorrect. Even with correct results, the model struggled to properly ground a response in the code result. Functionary 2.4 has significantly improved this aspect. Keep an eye out for new tutorials on how to best utilize this feature (follow us on X! @meetkaiinc), and to get started quickly, here's a tool to help you:
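As a rough sketch of what a code-interpreter-style request might look like: the payload below enables such a tool alongside a data-analysis question. The exact tool spelling (`{"type": "code_interpreter"}`) and the model identifier are assumptions here; consult the Functionary documentation for the exact request shape your server expects.

```python
import json

# Hypothetical request payload for an OpenAI-compatible Functionary
# server, enabling a code-interpreter-style tool. The tool type string
# and model name are assumptions, not confirmed API details.
payload = {
    "model": "meetkai/functionary-small-v2.4",
    "messages": [
        {
            "role": "user",
            "content": "What is the mean of the 'price' column in data.csv?",
        },
    ],
    "tools": [{"type": "code_interpreter"}],
}

# Serialize as the JSON body of a POST to the chat-completions endpoint.
body = json.dumps(payload)
```

The model is then expected to write the analysis code itself, receive the execution result back, and ground its final answer in that result.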
Functionary 2.5, targeted for release in May, will focus on improving larger-context grounding. This will enable the model to better utilize tool results, such as injected context from a long-context RAG "search" tool, reducing hallucination and producing more accurate end results. In early testing on internal real-world benchmarks, the medium variant of our 2.4 model already outperforms other permissively licensed open models. With 2.5 we want to set a new standard here, even compared to proprietary models.
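The tool-result grounding described above follows the standard OpenAI message flow: the model emits a tool call, the application executes it, and the result is fed back as a `role="tool"` message so the next completion is grounded in it. A minimal sketch, with a hypothetical `search` tool and placeholder content:

```python
# Hedged sketch of the message flow for grounding a response in a tool
# result. The "search" tool, call id, and retrieved passage are all
# illustrative placeholders.
messages = [
    {"role": "user", "content": "When was the warranty policy last updated?"},
    {
        # The model's turn: it decides to call the search tool.
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "search",
                    "arguments": '{"query": "warranty policy update"}',
                },
            }
        ],
    },
    {
        # The application's turn: the retrieved passage is injected back
        # so the next completion can cite it instead of hallucinating.
        "role": "tool",
        "tool_call_id": "call_1",
        "content": "Warranty policy, revised 2024-01-15: ...",
    },
]
```

Sending this `messages` list back to the model yields a final answer grounded in the injected passage; larger-context grounding is about doing this reliably when the injected content is long.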
As always, feel free to reach out to us (email: [email protected]). At MeetKai, we work with companies of all sizes to deploy end applications that leverage LLMs to deliver concrete value, from immersive workplace training to traditional chatbots and everything in between. If you run into any issues using Functionary, or have feedback or feature requests, please open an issue on our GitHub repo.