INDEX
Explanations
references to casualties or loss of life
Negative Logits
Token     Logit
ÏĥÏĦα     -0.07
ãģ¥       -0.07
imli      -0.07
armed     -0.07
igham     -0.07
ierz      -0.07
mam       -0.07
ãĥ³ãĥĦ    -0.07
Sachs     -0.07
æ¨        -0.07
Positive Logits
Token     Logit
losses    0.10
loss      0.09
loss      0.09
Loss      0.08
Loss      0.07
toll      0.07
Cas       0.07
cas       0.07
attr      0.07
injury    0.06
Activations Density: 0.052%