Explanations
instances of ridicule and criticism, particularly related to social behaviors or choices
Negative Logits
Token      Logit
Chham      -0.62
ợp         -0.59
GEBURTS    -0.56
Escor      -0.56
Sequ       -0.55
zkiem      -0.53
Sequ       -0.53
pfung      -0.53
коменду    -0.52
Ando       -0.52
Positive Logits
Token      Logit
ridicule   1.19
mocking    1.15
mocked     1.09
mockery    1.05
laugh      1.02
mocks      0.96
ridiculed  0.96
laughing   0.95
scorn      0.95
contempt   0.95
Activations Density: 0.388%
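
The density figure is commonly read as the fraction of token positions on which a feature fires at all. The sketch below illustrates that reading on toy data. It is an assumption about how such a number can be computed, not the dashboard's confirmed method; the function name `activation_density`, the `threshold` parameter, and the toy activations are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is strictly
    greater than `threshold` (0.0 by default). The source does not state the
    exact rule behind its density figure.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 4 of which fire.
toy = [0.0] * 996 + [0.7, 1.3, 0.2, 2.1]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.388% would mean the feature is active on roughly 4 of every 1000 sampled tokens, which fits a narrow feature that responds mainly to words about ridicule and mockery.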