INDEX
    Explanations

    phrases that express certainty or emphasis

    phrases indicating certainty or frequency

    Negative Logits
    åĤ  -0.71
     Zion  -0.67
    umbn  -0.67
    tnc  -0.66
    ãĤ¶  -0.66
    ãģķ  -0.65
    ensis  -0.65
    ãģ®éŃĶ  -0.64
    aciously  -0.64
    idated  -0.64
    Positive Logits
     importantly  0.76
    entimes  0.75
     kidding  0.71
     referen  0.70
    withstanding  0.69
    Sounds  0.68
    humans  0.68
     Speaking  0.66
     unsurprisingly  0.66
     Negative  0.65
    Activation Density: 0.102%

    No Known Activations