INDEX
    Explanations

    affirmations and expressions of agreement

    Negative Logits
    ès          -0.16
    ogene       -0.15
    acker       -0.15
     Gro        -0.15
    aggio       -0.15
     Ober       -0.15
    Gro         -0.14
    ÙĨا         -0.14
     Mog        -0.14
    amas        -0.14
    Positive Logits
    boro        0.16
    fully       0.15
    ersh        0.15
    ffa         0.14
    idian       0.14
    ÙĪÙĨÙĬØ©    0.14
    orth        0.14
    ARGET       0.14
    ¦¬          0.14
    edii        0.14
    Act Density 0.036%

    No Known Activations