Tegy provider research

Multimodal model options that avoid the Qwen Max image-input failure

Comparison of moonshotai/kimi-k2.7-code, minimax/minimax-m3, google/gemini-3-flash-preview, and google/gemini-3.5-flash for Tegy's OpenRouter + Cloudflare AI Gateway path.

Prepared June 14, 2026. Prices and metadata are current OpenRouter listings unless otherwise noted. Anthropic and OpenAI models are intentionally excluded.

Safest cheap swap MiniMax M3 Image + video input, 1M listed context, lowest listed output cost.
Best benchmark signal Gemini 3.5 Flash Highest OpenRouter AA agentic index among these four; strong Google model-card scores.
Newest listing Kimi K2.7 Code June 12 OpenRouter listing, 1T total / 32B active parameters in Cloudflare docs.
Lowest-confidence pick Gemini 3 Flash Preview Works with images, but 3.5 Flash is a direct newer improvement in the same family.

Bottom line for Tegy

All four candidates support image input, so they should not produce the OpenRouter No endpoints found that support image input failure that happened with qwen/qwen3.7-max. For a production trial, I would test minimax/minimax-m3 first if cost and context matter most, and google/gemini-3.5-flash first if benchmark quality matters most.

Provider Metadata

Model OpenRouter listed / release date Input modalities Context Output cap OpenRouter pricing per 1M tokens Supported agent parameters
Kimi K2.7 Code
moonshotai/kimi-k2.7-code
June 12, 2026 textimage 262,144 tokens 262,144 tokens Input $0.75
Output $3.50
Cache read $0.16
toolstool_choicereasoningstructured_outputs
MiniMax M3
minimax/minimax-m3
May 31, 2026 textimagevideo 1,048,576 listed; top provider shows 524,288 512,000 tokens Input $0.30
Output $1.20
Cache read $0.06
toolstool_choicereasoningstructured_outputs
Gemini 3 Flash Preview
google/gemini-3-flash-preview
December 17, 2025 textimagefileaudiovideo 1,048,576 tokens 65,536 tokens Input $0.50
Output $3.00
Image $0.50
Audio $1.00
toolstool_choicereasoningstructured_outputs
Gemini 3.5 Flash
google/gemini-3.5-flash
May 19, 2026 on OpenRouter; Google model card published May 2026 textimagefileaudiovideo 1,048,576 tokens 65,536 tokens Input $1.50
Output $9.00
Image $1.50
Audio $3.00
toolstool_choicereasoningstructured_outputs

Benchmark / Model Strength Signals

Model Artificial Analysis indices via OpenRouter Other benchmark / model-size signal Read
Kimi K2.7 Code Not exposed in the OpenRouter model metadata at report time. Cloudflare Workers AI docs list 1T total parameters, 262.1k context, vision input, tool calling, reasoning, and structured outputs. Kimi's release note reports +21.8% on Kimi Code Bench v2 over K2.6, but that is a vendor-family delta rather than a cross-model benchmark. Newest and transparent on parameter scale. Benchmark coverage is thinner than Gemini/MiniMax in the public metadata.
MiniMax M3 Intelligence 54.7
Coding 43.4
Agentic 68.6
MiniMax reports SWE-Bench Pro 59.0%, Terminal-Bench 2.1 66.0%, SWE-fficiency 34.8%, KernelBench Hard 28.8%, and MCP Atlas 74.2%. Best cost/context profile here, and the OpenRouter agentic index is better than Qwen Max's 66.6 baseline.
Gemini 3 Flash Preview Intelligence 46.4
Coding 42.6
Agentic 49.7
Google's Gemini 3 developer guide lists 1M / 64k context and $0.50 / $3 pricing. The Gemini 3.5 model card gives Gemini 3 Flash baseline scores including Terminal-bench 58.0%, SWE-Bench Pro 49.6%, MCP Atlas 62.0%, CharXiv Reasoning 80.3%, and MMMU-Pro 81.2%. Capable and cheap, but mostly superseded by 3.5 Flash for this shortlist.
Gemini 3.5 Flash Intelligence 55.3
Coding 45.0
Agentic 70.3
Google model card reports Terminal-bench 2.1 76.2%, SWE-Bench Pro 55.1%, MCP Atlas 83.6%, OSWorld-Verified 78.4%, CharXiv Reasoning 84.2%, and MMMU-Pro 83.6%. Best quality signal of the four. The tradeoff is price: output is 7.5x MiniMax M3 and 2.57x Kimi K2.7 Code.

Ranking for Tegy Trials

1. MiniMax M3 for first cost-sensitive production trial

It is multimodal, cheap, long-context, explicitly oriented toward long-horizon agentic work, and has a stronger OpenRouter agentic index than the Qwen Max baseline. It is the most pragmatic candidate if the goal is to keep costs low while fixing image turns.

2. Gemini 3.5 Flash for best benchmark quality

It has the best public benchmark signal in this comparison and broad multimodal input support, including files. The downside is cost, especially output tokens and internal reasoning.

3. Kimi K2.7 Code as the newest coding-agent option

Kimi is attractive because it is fresh, vision-capable, tool-capable, and large. It needs a real Tegy smoke test because its cross-model benchmark metadata is less complete.

4. Gemini 3 Flash Preview only if 3.5 Flash is too expensive

It has the same broad modality family and lower pricing than 3.5 Flash, but the benchmark gap is clear enough that it should not be the default unless cost or availability forces it.

Operational Caveats

Image capability in OpenRouter metadata is necessary but not sufficient. Before switching Tegy, run the exact Claude Agent SDK path through Cloudflare AI Gateway with: text-only chat, image upload, PDF/DOCX Sprinta analysis, natural questionnaire creation, tool/plugin usage, and cache-status capture.

The prior crash was caused by a text-only endpoint rejecting image blocks. This report compares models that advertise image input; it does not prove their behavior under Tegy's complete Claude SDK loop.

Sources

  1. OpenRouter models API: https://openrouter.ai/api/v1/models
  2. OpenRouter Qwen model family page: https://openrouter.ai/qwen
  3. MiniMax M3 release and benchmark post: https://www.minimax.io/blog/minimax-m3
  4. Cloudflare Workers AI Kimi K2.7 Code page: https://developers.cloudflare.com/workers-ai/models/kimi-k2.7-code/
  5. Google Gemini 3 developer guide: https://ai.google.dev/gemini-api/docs/gemini-3
  6. Google DeepMind Gemini 3.5 Flash model card: https://deepmind.google/models/model-cards/gemini-3-5-flash/
  7. Kimi release announcement on Reddit: https://www.reddit.com/r/kimi/comments/1u3ri2w/kimik27code_our_latest_coding_model_is_now/