INDEX
    Explanations

    phrases related to sudden intense events or actions

    descriptions of traumatic or violent events

    Negative Logits
    20439           -0.85
    ゴン            -0.81
     Achieve        -0.72
     Inher          -0.71
     annually       -0.70
     Patreon        -0.69
    yrights         -0.69
     endeavors      -0.68
     Architects     -0.68
    ortium          -0.67
    Positive Logits
     screaming      1.02
     ..."           1.02
     panicked       1.00
     yelling        0.99
     …"             0.97
     [              0.95
     ['             0.93
     luckily        0.93
    ,'"             0.90
     kinda          0.90
    Act Density 0.451%

    No Known Activations