Agent building and orchestration
Coding assistants
VS Code AI Toolkit
Develop AI agents and apps with this lightweight VS Code extension.
Evaluation and monitoring

LM evaluation harness
Run reproducible, large-scale LLM benchmarks with the LM evaluation harness.
Weave
Debug, evaluate, and improve LLM applications with real-time observability and experimentation.