INDEX
    Explanations

    expressions indicating certainty or likelihood of events

    phrases that express negation or disbelief regarding various situations or statements

    Negative Logits
    Token               Logit
    "kefeller"          -0.74
    " Citiz"            -0.65
    "escription"        -0.64
    "İĭ"                -0.60
    "DragonMagazine"    -0.58
    "apons"             -0.57
    "anian"             -0.57
    " pione"            -0.57
    "artney"            -0.56
    "isin"              -0.56
    Positive Logits
    Token               Logit
    "!"                 1.08
    " ðŁĻĤ"             1.01
    "."                 1.00
    " Nope"             0.99
    "!!!!"              0.98
    "!!!"               0.97
    "!!"                0.93
    "!."                0.93
    "*."                0.90
    "%."                0.89
    Act Density 0.204%
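
    The density figure above can be read as the share of token positions on which the feature fires. The sketch below assumes that reading (activations strictly above zero count as "active"); the function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where
    the feature is active, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 1000 token positions, feature fires on 2 of them.
sample = [0.0] * 998 + [1.3, 0.7]
density = activation_density(sample)
print(f"Act Density {density:.3f}%")
```

    Under this reading, a density of 0.204% would correspond to the feature firing on roughly 2 of every 1000 tokens.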

    No Known Activations