    Explanations

    phrases indicating agreement or validation in various contexts

    Negative Logits
    å¡ļ
    -0.15
     Gro
    -0.15
    اء
    -0.14
    ULA
    -0.14
    aged
    -0.14
    ifer
    -0.14
    æĤ
    -0.14
     wil
    -0.14
    ums
    -0.14
     dist
    -0.14
    Positive Logits
    ymoon
    0.16
    xit
    0.15
    Tween
    0.15
     vej
    0.15
    Declared
    0.15
    адж
    0.15
    ductor
    0.14
    ãĥ³ãĤ¯
    0.14
    upo
    0.14
    ksam
    0.14
    Act Density 0.119%

    No Known Activations