north-star
System prompt that overrides three structural presumptions introduced by RLHF training.
🛡️ AgentReady threat assessment
MAESTRO 7-layer threat model + OWASP AIVSS risk score for north-star, derived from its capabilities.
AIVSS 7.2 · High
Overview
A system-prompt plugin that overrides three structural presumptions from RLHF training to change how the agent reasons and responds. Surface: an output-style/system-prompt override installed as a plugin. Because it modifies the base system prompt, it changes agent behavior directly and is worth scrutiny before installation.
Key features
- System-prompt override
- Counteracts RLHF structural biases
- Behavior-shaping output style
- Plugin-installed
Use cases
- Adjust default agent reasoning tendencies
- Experiment with alternative response framing