---
title: ""It's Just a Skill File" (Famous Last Words)"
description: "<p><img src="/img/its-just-another-skill.png" alt="its-just-another-skill-file"></p>
<p>Skills like <a href="https://skills.sh">skills.sh</a> (tiny text “how-to” files that steer an agent toward a task) feel harmless because they’re <strong>just instructions</strong>.</p>
<p>But that’s exactly why they can become an <strong>attack vector</strong>.</p>
<p>A skill file is basically <strong>executable intent</strong>:</p>
<ul>
<li>it sets the agent’s assumptions (“trust this source”)</li>
<li>it defines the workflow (“run these steps”)</li>
<li>it can nudge boundaries (“skip confirmations”, “always do X”)</li>
</ul>
<p>The tricky part is: <strong>this attack doesn’t have to come from external prompt injection at all.</strong><br>
Skills often live <em>inside</em> your environment (repo, dotfiles, shared templates, internal skill packs). If a malicious or compromised skill gets into that internal distribution path, it arrives with a “trusted” label by default.</p>"
doc_version: "1"
last_updated: "2026-01-29"
date: 2026-01-29
tags: [ai, skills, llm]
canonical: "https://eran.sandler.co.il/2026/01/29/its-just-a-skill-file-famous-last-words/"
---

## Sitemap

- [Home](https://eran.sandler.co.il/)


![its-just-another-skill-file](/img/its-just-another-skill.png)

Skills like [skills.sh](https://skills.sh) (tiny text “how-to” files that steer an agent toward a task) feel harmless because they’re **just instructions**.

But that’s exactly why they can become an **attack vector**.

A skill file is basically **executable intent**:
- it sets the agent’s assumptions (“trust this source”)
- it defines the workflow (“run these steps”)
- it can nudge boundaries (“skip confirmations”, “always do X”)

The tricky part is: **this attack doesn’t have to come from external prompt injection at all.**  
Skills often live *inside* your environment (repo, dotfiles, shared templates, internal skill packs). If a malicious or compromised skill gets into that internal distribution path, it arrives with a “trusted” label by default.

And unlike one-off injections, skills can **persist**:
- used across multiple projects
- copied forward by templates
- installed once and reused
- surviving long after the original context is gone, with no obvious “reinstall” moment

So if skills are pulled from a repo, shared internally, copied from the internet, or composed dynamically… you’ve created a **high-trust injection point** that can quietly outlive the project that introduced it.

This isn’t theoretical:

- Cisco’s team [analyzed](https://blogs.cisco.com/ai/personal-ai-agents-like-moltbot-are-a-security-nightmare) “community skills” for personal agents and showed how a skill can embed behavior that looks a lot like malware-by-instructions (data exfil patterns, unsafe actions, etc.)
- [Research](https://arxiv.org/abs/2510.26328) explicitly calls out “Agent Skills” (markdown/text skill files) as enabling a *new class* of prompt injections—because attackers can hide malicious instructions inside long skill content or referenced scripts
- Another [large-scale study](https://arxiv.org/abs/2601.10338) scanned tens of thousands of skills and found a meaningful fraction with vulnerability patterns, with skills that bundle executable scripts more likely to be risky

**Takeaway:** if your agent treats skill text as authoritative, then **skill text is part of your security boundary**.

Guardrails I’m adopting:
- Treat skills like code: PR review, ownership, diff alerts
- Pin to known commits / provenance (ideally signed)
- Make skills immutable at runtime (no self-modifying “update your skill” loops)
- Keep skill instructions separate from retrieved/untrusted content

We learned “config is code.”  
Now it’s “prompts are code”… and **skills are attack surface**.

How are you governing skills in your agent stack today?


