INDEX
    Explanations

    phrases related to intentions and reasons

    phrases indicating certainty, intent, and lack of evidence

    Negative Logits
    ahime      -0.79
    pherd      -0.65
    ortment    -0.62
    tsky       -0.62
    iets       -0.61
    igs        -0.61
    visor      -0.60
    types      -0.60
    acerb      -0.60
    ork        -0.60
    Positive Logits
     whatsoever     1.84
     nor            1.15
     anymore        0.95
     anywhere       0.85
     hesitation     0.80
     anybody        0.77
     except         0.75
     slightest      0.75
     EVER           0.74
     necessarily    0.74
    Activation Density: 0.169%

    No Known Activations