INDEX
    Explanations

    phrases indicating uncertainty or negation of opinions and characteristics

    Negative Logits
    Token       Logit
    "ichel"     -0.17
    " ANY"      -0.15
    "ylon"      -0.15
    "idir"      -0.15
    "ÏĥÏĢ"      -0.15
    "åıĪ"       -0.15
    "acie"      -0.14
    "šti"       -0.14
    "_PROC"     -0.14
    "ptive"     -0.14
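The entries ÏĥÏĢ and åıĪ in the negative-logit list look like tokens printed in a byte-level BPE alphabet rather than as decoded text. The sketch below assumes a GPT-2-style byte-to-character table, which this page does not confirm; the helper names `byte_decoder` and `decode_token` are hypothetical. Under that assumption, the two tokens decode to valid UTF-8 strings.

```python
def byte_decoder() -> dict[str, int]:
    """Map each display character back to its raw byte.

    Assumption: the tokenizer uses the GPT-2-style byte-level alphabet,
    where printable bytes keep their own code point and the remaining
    bytes are shifted to code points 256 and above.
    """
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {chr(b): b for b in keep}
    shift = 0
    for b in range(256):
        if b not in keep:
            # Non-printable bytes are assigned 256, 257, ... in byte order.
            table[chr(256 + shift)] = b
            shift += 1
    return table


def decode_token(token: str) -> str:
    """Decode one displayed token to text.

    Raises KeyError if the token contains a character outside the
    byte-level alphabet (for example, an already-decoded token).
    """
    table = byte_decoder()
    raw = bytes(table[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ÏĥÏĢ"))  # σπ
print(decode_token("åıĪ"))   # 又
```

In the same alphabet a leading space is displayed as `Ġ`, so `decode_token("Ġexact")` would return `" exact"`. Tokens in the lists that are already shown as plain text, such as "šti", are not in this alphabet and would raise `KeyError` here.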
    Positive Logits
    Token             Logit
    " exact"          0.39
    " exactly"        0.35
    " directly"       0.35
    " necessarily"    0.29
    "exact"           0.28
    " specifically"   0.28
    " direct"         0.27
    " explicitly"     0.26
    " precise"        0.25
    " Exactly"        0.25
    Activation Density: 0.353%

    No Known Activations