INDEX
    Explanations

    words related to intentionality or purposeful action

    words and phrases indicating intentionality or deliberate actions

    Negative Logits
    addons          -0.85
    soon            -0.79
     Parables       -0.75
    esc             -0.72
     Warrant        -0.71
     Citation       -0.69
    anon            -0.69
    norm            -0.69
    rose            -0.68
    ĸļ              -0.68

    Positive Logits
     misleading      0.87
     misrepresent    0.84
     mislead         0.80
     provoking       0.80
     provocative     0.79
     misled          0.78
     sabot           0.78
     obfusc          0.78
     dece            0.75
     sabotage        0.75
    Act Density 0.026%

    No Known Activations