INDEX
Explanations
situations involving conflict or competition between parties
Negative Logits
avian     -0.16
arters    -0.15
bach      -0.15
атков     -0.15
テル      -0.14
raids     -0.13
itten     -0.13
chor      -0.13
권        -0.13
»         -0.13
Positive Logits
against   0.62
against   0.54
Against   0.49
Against   0.45
对        0.43
проти     0.40
против    0.39
對        0.39
対        0.38
tegen     0.38
Activations Density 0.334%