    Negative Logits
    Token               Logit
    "(+)"               0.38
    "ாலம்"               0.37
    "embedding"         0.37
    " लची"              0.37
    (no token shown)    0.37
    " Stochastic"       0.37
    " Async"            0.36
    " (+)"              0.36
    " शुभकामनाएं"        0.36
    " lacus"            0.36

    Positive Logits
    Token               Logit
    " anger"            3.00
    " angry"            2.88
    "愤怒"              2.59
    " angered"          2.55
    " frustration"      2.44
    " angrily"          2.44
    " enraged"          2.44
    (no token shown)    2.44
    " Anger"            2.41
    " frustrated"       2.36

    Activation Density: 0.106%

    No Known Activations