INDEX
    Explanations

    phrases that indicate uncertainty or speculation

    Negative Logits
    Token       Logit
    " są"       -0.15
    "一样"      -0.14
    "339"       -0.14
    "@js"       -0.14
    "一起"      -0.14
    "ῶ"         -0.14
    "same"      -0.13
    "Same"      -0.13
    "è»"        -0.13
    "sla"       -0.13
    Positive Logits
    Token           Logit
    " nobody"       0.35
    " few"          0.29
    " none"         0.29
    " everyone"     0.26
    " many"         0.23
    " everybody"    0.22
    " no"           0.22
    " anyone"       0.21
    " def"          0.21
    " NONE"         0.21
    Act Density 0.215%

    No Known Activations