    Explanations

    instructions and queries

    text that contains explicit instructions, rules, or constraints directing the assistant's behavior (system prompts and policy-style directives).

    language that conveys formal task specifications—constraints, procedural instructions, policies, links/resources, templates/formats, and feature requirements.

    Negative Logits
    Incidentally    0.41
    defn            0.40
     doubtless      0.39
     paltry         0.38
    <unused303>     0.38
     ostensibly     0.38
     stalwart       0.37
    <unused2049>    0.36
    <unused267>     0.36
    THRESH          0.36
    Positive Logits
     bellow         0.61
    ´               0.60
     advices        0.56
     planification  0.56
     lenght         0.56
     ressources     0.55
     Nowadays       0.55
     wich           0.53
     partecip       0.52
     restauration   0.52
    Activation Density: 0.073%

    No Known Activations