INDEX
    Explanations

    affirmations and expressions of agreement

    Negative Logits
    Token        Logit
    "ect"        -0.21
    " Doch"      -0.15
    "eb"         -0.15
    "ogne"       -0.15
    "celik"      -0.14
    "ão"         -0.14
    "tn"         -0.14
    " quand"     -0.14
    "andon"      -0.14
    "nt"         -0.14

    Positive Logits
    Token        Logit
    " sure"      0.20
    "yeah"       0.20
    "sure"       0.20
    "emek"       0.17
    "redient"    0.17
    " right"     0.17
    "Yeah"       0.17
    "tember"     0.17
    "quake"      0.16
    "ARB"        0.16
    Activation Density: 0.015%

    No Known Activations