INDEX
Explanations
references to military forces and their nationalities
Negative Logits
ç¹Ķ        -0.17
ãĥ«ãĥķ     -0.15
eft        -0.14
ÑģÑĤÑĢой   -0.14
yne        -0.14
è¦ļ        -0.14
_UL        -0.14
pand       -0.14
inke       -0.13
LIK        -0.13
Positive Logits
arias      0.17
ovsky      0.16
GM         0.15
andan      0.14
hazi       0.14
Pent       0.14
arty       0.14
errat      0.13
úb         0.13
cur        0.13
Activations Density 0.038%