INDEX
    Explanations

    Instances of the word "don’t" and its variants (contractions ending in "’t"), indicating a focus on negation or restriction.

    Negative Logits
    Token     Logit
    ãĤ®       -0.14
    ures      -0.14
    resa      -0.13
    LES       -0.13
    ll        -0.13
    ta        -0.13
    ieur      -0.13
    les       -0.13
    ="__      -0.13
    æŃ²       -0.13
    Positive Logits
    Token     Logit
    't        0.24
    'T        0.19
    `t        0.19
    +t        0.17
    ’t        0.17
    ot        0.17
    nost      0.17
    ;t        0.16
    et        0.16
    ´t        0.16
    Activation Density: 0.091%
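
The density figure above can be read as the share of token positions on which the feature fires. The sketch below assumes that definition (activation strictly above a threshold, divided by the total number of positions); the page itself does not state how the value was computed. The function name `activation_density` and the sample data are hypothetical, chosen only to reproduce a value of the same magnitude.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means active positions / total positions.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 91 active positions out of 100,000 tokens.
sample = [1.0] * 91 + [0.0] * (100_000 - 91)
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.091%"
```

Under this reading, a density of 0.091% corresponds to roughly 91 active positions per 100,000 tokens, which would fit a feature tied to one narrow family of tokens.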

    No Known Activations