INDEX
Explanations
terms related to military personnel and their organization
Negative Logits
Token                            Logit
U+FE0F (variation selector-16)   -0.15
elden                            -0.15
tle                              -0.15
lisi                             -0.14
iddy                             -0.14
terrorism                        -0.14
ilter                            -0.13
sext                             -0.13
eway                             -0.13
heid                             -0.13
Positive Logits
Token      Logit
/nav       0.21
station    0.17
men        0.17
al         0.17
camps      0.16
駐         0.16
deployed   0.16
ieri       0.16
士         0.16
anzi       0.16
Activations Density 0.050%
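
Tokens from byte-level BPE vocabularies are often displayed with one placeholder character per UTF-8 byte, so a single CJK character can show up as a three-character string such as `é§IJ`. The sketch below shows one way to map such display strings back to readable text. It assumes the GPT-2-style byte-to-character table; the page does not name its tokenizer, so this is an assumption, and the function names are hypothetical.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> display-character table in the GPT-2 byte-level BPE style.

    Assumption: the dashboard's tokenizer uses this table.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    offset = 0
    for b in range(256):
        if b not in table:
            # Non-printable bytes are shifted to code points from 256 upward.
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token: str) -> str:
    """Map a byte-level display string back to text.

    Tokens containing characters outside the table (already readable text,
    such as a CJK character) are returned unchanged.
    """
    if any(ch not in UNICODE_TO_BYTE for ch in token):
        return token
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial UTF-8 sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Escapes are used so the example does not depend on file encoding.
    for shown in ["\u00e9\u00a7\u0132", "\u00ef\u00b8\u0131", "station"]:
        decoded = decode_display_token(shown)
        print(repr(shown), "->", [f"U+{ord(c):04X}" for c in decoded])
```

Under that assumption, the first string decodes to U+99D0 (駐, "stationed"), which fits the feature's military explanation, and the second decodes to U+FE0F, an invisible variation selector.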