LLM Diff / Eval
Run
Select the models and queries to run. All are checked by default.
Last run: 2026-01-31 17:14:11
Models
DeepSeek V3.2 (fireworks)
GLM 4.7 (fireworks)
Kimi K2.5 (fireworks)
Qwen3 235B A22B Instruct 2507 (fireworks)
Manage models at /models
Queries
why the sky is blue?
what's your name?
tell me a joke? for 10 years old kids.
explain difference between reasoning vs non-reasoning model.
where is the capital of US?
What should senior engineer leader do? Be brief, only list the top 3 most important things they need to do day 2 day.
Prove that pi is irrational
Manage queries at /queries
Run benchmark
Run in progress…