INDEX
    Explanations

    expressions of praise or positive evaluation

    Negative Logits
    Token         Logit
    "itr"         -0.16
    "thouse"      -0.15
    "afil"        -0.15
    "yet"         -0.14
    "ngen"        -0.14
    "zap"         -0.14
    "dech"        -0.14
    " certo"      -0.14
    " belli"      -0.14
    " Goodman"    -0.14
    Positive Logits
    Token         Logit
    "s"           0.38
    "-grand"      0.37
    " deal"       0.32
    "sword"       0.29
    " deals"      0.28
    "-value"      0.26
    " dane"       0.24
    "-looking"    0.24
    "seller"      0.24
    "ful"         0.23
    Act Density 0.041%

    No Known Activations