LLM output reliability is critical, particularly for numerical operations and action execution. Upsonic addresses this with a multi-layered reliability system that uses control agents and verification rounds to check output accuracy.
Upsonic is a reliability-focused framework. The results in the table were generated with a small dataset. They show success rates in the transformation of JSON keys. No hard-coded changes were made to the frameworks during testing; only the existing features of each framework were activated and run. GPT-4o was used in the tests.
Ten transformations were performed for each section. The numbers show the error count, so a value of 7 means that 7 out of 10 were done incorrectly. The table is based on initial results. We are expanding the dataset, and the tests will become more reliable once a larger test set is in place. Reliability benchmark repo
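To read the table as success rates rather than error counts, the conversion can be sketched as below. The function name `success_rate` and the default of 10 runs per section are illustrative assumptions based on the description above, not part of the benchmark code.

```python
def success_rate(error_count: int, total_runs: int = 10) -> float:
    """Convert an error count into a success rate between 0.0 and 1.0.

    total_runs defaults to 10 because each section was run 10 times
    (an assumption taken from the text, not from the benchmark repo).
    """
    if not 0 <= error_count <= total_runs:
        raise ValueError("error_count must be between 0 and total_runs")
    return (total_runs - error_count) / total_runs


if __name__ == "__main__":
    # A table entry of 7 means 7 of 10 runs failed, so 3 of 10 succeeded.
    print(f"{success_rate(7):.0%}")
```

With so few runs per section, each single error shifts the rate by 10 percentage points, which is why the results should be treated as preliminary.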
Install the dependencies and create an environment
pip install uv
uv venv
uv sync
Set your environment variables in .env
# for Upsonic
AZURE_OPENAI_ENDPOINT="https://**.com/"
AZURE_OPENAI_API_VERSION="****-**-**"
AZURE_OPENAI_API_KEY="***"
# for CrewAI
AZURE_API_KEY="***"
AZURE_API_BASE="https://**.com/"
AZURE_API_VERSION="****-**-**"
# for LangGraph
OPENAI_API_VERSION="****-**-**"
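Before running the benchmark, it may help to confirm that every variable is actually set. The following is a minimal sketch using only the standard library; it assumes the values from .env have already been loaded into the process environment (for example by your shell or a dotenv loader), and the helper name `missing_vars` is hypothetical.

```python
import os

# Variable names copied from the .env listing above, grouped by framework.
REQUIRED_VARS = {
    "Upsonic": [
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_API_VERSION",
        "AZURE_OPENAI_API_KEY",
    ],
    "CrewAI": ["AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"],
    "LangGraph": ["OPENAI_API_VERSION"],
}


def missing_vars(environ=None):
    """Return {framework: [unset variable names]} for incomplete frameworks."""
    env = os.environ if environ is None else environ
    missing = {}
    for framework, names in REQUIRED_VARS.items():
        # Treat empty strings the same as unset variables.
        unset = [name for name in names if not env.get(name)]
        if unset:
            missing[framework] = unset
    return missing


if __name__ == "__main__":
    for framework, names in missing_vars().items():
        print(f"{framework}: missing {', '.join(names)}")
```

Running the script prints one line per framework with unset variables and prints nothing when everything is configured.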