    Explanations

    avoiding frustration or aggression

    Negative Logits
     அதிசய    0.72
    神经网络    0.72
     Vermeer    0.68
     אור    0.68
    𝄞    0.67
     लागे    0.67
    AppCompat    0.67
    безпе    0.67
     WorldCat    0.66
    ocyte    0.66
    Positive Logits
     anger    2.57
     angry    2.53
     angrily    2.13
     rage    2.13
     aggressive    2.11
     enraged    2.11
     aggression    2.09
     hostility    1.97
     fury    1.95
     angered    1.92
    Activation Density: 1.936%

    No Known Activations