    Explanations

    affirmations or acknowledgments of agreement

    Negative Logits
    "_ASSUME"    -0.18
    "nici"       -0.16
    "strup"      -0.15
    "owler"      -0.14
    "ใด"         -0.14
    "сом"        -0.14
    "uentes"     -0.14
    "olvers"     -0.14
    "欲"         -0.14
    "uffles"     -0.14
    Positive Logits
    "AY"         0.34
    " fine"      0.33
    "ay"         0.31
    "lahoma"     0.29
    "ays"        0.29
    "tober"      0.27
    "fine"       0.25
    " Fine"      0.25
    " then"      0.23
    " so"        0.23
    Activation Density: 0.030%

    No Known Activations