INDEX
    Explanations

    words related to comparison or evaluation

    phrases indicating excessiveness or negative evaluations

    Negative Logits
    "elin"              -0.77
    " showc"            -0.73
    "orter"             -0.64
    "ayn"               -0.64
    " uninterrupted"    -0.63
    "alid"              -0.62
    "licts"             -0.62
    "eret"              -0.62
    "origin"            -0.61
    "Nar"               -0.61
    Positive Logits
    " coincidence"      0.69
    " anymore"          0.65
    " guessed"          0.64
    "icable"            0.62
    " bothering"        0.61
    " Smoking"          0.60
    "ables"             0.60
    "fy"                0.59
    " coinc"            0.58
    " Fancy"            0.58
    Activation Density: 0.131%

    No Known Activations