The pitfall of open-weight LLMs

hiddenest · 6/25/2025, 11:44:14 PM
Some startups are fine-tuning open LLMs instead of using GPT or Gemini, sometimes to support a specific language, sometimes for narrow tasks. But I found they're all making the same mistake.

With a simple prompt (which I won't share here), I got several "custom" LLM services to spill their internal system prompts, including things like security-breach response playbooks and product action lists.

For example, SKT A.X 4.0 (based on Qwen 2.5) returned internal guidelines related to the recent SKT data breach and instructions about compensation policies. Vercel’s v0 model leaked examples of actions their system can generate.

The point: if the base model can be coaxed into revealing its prompt, every service built on it inherits that weakness, no matter how much you fine-tune. We need to think not only about system-prompt hardening at the service level, but also about upstream improvements and more robust defenses in open-weight LLMs themselves.
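For illustration, here is a minimal sketch of one service-level mitigation: an output filter that refuses to return a response that substantially echoes the system prompt. The function names and the n-gram overlap threshold are my own assumptions, not something any of these services actually use.

```python
# Minimal sketch of a service-level leak filter (illustrative only).
# Assumption: the service wraps its LLM call and can inspect each response
# before returning it to the user.

def ngram_set(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of word n-grams in `text`, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def leaks_system_prompt(response: str, system_prompt: str,
                        n: int = 5, threshold: float = 0.2) -> bool:
    """Flag a response if too many of the system prompt's n-grams appear in it."""
    prompt_ngrams = ngram_set(system_prompt, n)
    if not prompt_ngrams:
        return False
    overlap = len(prompt_ngrams & ngram_set(response, n)) / len(prompt_ngrams)
    return overlap >= threshold

def guarded_reply(response: str, system_prompt: str) -> str:
    """Return the model's response, or a refusal if it echoes the system prompt."""
    if leaks_system_prompt(response, system_prompt):
        return "Sorry, I can't share that."
    return response
```

Of course this only catches near-verbatim leakage at the service layer; paraphrased leaks, and the upstream weakness in the base model itself, still need to be addressed.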

Comments (1)

bigyabai · 10h ago
You shouldn't trust any LLM with data that could be leaked to an end user, period. If you do, it's not an issue with the weights; it's a glaring oversight in your security model.