INDEX
Explanations
references to soldiers and military-related terms
Negative Logits
bindValue    -0.15
oningen      -0.14
mitt         -0.14
inou         -0.14
ï¸ı          -0.14
Atlantis     -0.14
à¸ı          -0.14
obierno      -0.14
rane         -0.13
robe         -0.13
Positive Logits
士           0.18
men          0.16
-training    0.16
camps        0.16
station      0.15
iple         0.15
stationed    0.15
barracks     0.15
training     0.15
deployed     0.15
Activations Density: 0.070%