INDEX
    Explanations

    words related to personal experiences and explanations

    Negative Logits
     reluct         -1.74
     indestru       -1.71
     fta            -1.69
     ?...           -1.69
     emphat         -1.69
     strick         -1.69
     snoopy         -1.68
     increa         -1.67
     secon          -1.66
     disagre        -1.66
    Positive Logits
    <bos>           1.09
     nonetheless    1.04
     nevertheless   0.82
     ändå           0.75
     anyway         0.70
     enough         0.63
     comunque       0.62
    SystemColors    0.60
    .               0.59
     certainly      0.59
    Activation Density: 0.760%

    No Known Activations