Explanations
references to military forces and conflicts
Negative Logits
rosso    -0.20
anou     -0.16
quam     -0.15
eyim     -0.15
ascade   -0.15
yon      -0.14
USH      -0.14
rium     -0.14
laden    -0.14
iben     -0.14
Positive Logits
å£              0.17
strength        0.14
equivalence     0.14
ox              0.14
SA              0.14
whistleblower   0.14
544             0.13
Bed             0.13
itech           0.13
Rent            0.13
Activations Density 0.115%