    Negative Logits
        Token             Logit
        (blank)           -0.07
        "cite"            -0.06
        " ماسه"           -0.06
        "Highlighted"     -0.06
        (whitespace)      -0.06
        "ovie"            -0.06
        " başarılı"       -0.06
        " capsule"        -0.06
        " development"    -0.06
        "prefer"          -0.06
    Positive Logits
        Token             Logit
        " anger"          0.17
        " angry"          0.15
        " Angry"          0.10
        " fury"           0.10
        " rage"           0.09
        " angered"        0.09
        " angrily"        0.08
        " furious"        0.08
        " enraged"        0.08
        (blank)           0.07
    Activation Density: 0.012%

    No Known Activations