INDEX
    Explanations

    instances where a user is apologizing for not being knowledgeable about a particular topic

    repeated phrases conveying a sense of negation or absence

    Negative Logits
    Token             Logit
    "uez"             -0.64
    " weap"           -0.62
    " Favor"          -0.60
    "later"           -0.59
    " Houses"         -0.58
    "osion"           -0.58
    " abound"         -0.57
    "interstitial"    -0.55
    " Bind"           -0.55
    " Kills"          -0.55
    Positive Logits
    Token             Logit
    " gotten"         1.28
    " been"           1.14
    "gotten"          1.08
    " figured"        1.06
    "been"            1.05
    " slept"          1.04
    " bothered"       1.03
    " forgotten"      1.01
    " mastered"       1.00
    " done"           0.99
    Activation Density: 0.080%

    No Known Activations