INDEX
Explanations
discussions about loyalty and alliances within conflicts
Negative Logits
Token      Logit
lace       -0.15
duc        -0.15
orc        -0.15
tle        -0.14
IFI        -0.14
bard       -0.14
ensen      -0.14
olio       -0.14
ologne     -0.14
Replies    -0.13
Positive Logits
Token      Logit
sided      0.44
side       0.43
siding     0.42
sides      0.40
Side       0.36
side       0.36
-side      0.36
Side       0.34
-sided     0.33
_side      0.32
Activations Density: 0.202%