Explanations
references to innocence
the repeated mention of the term "innocent."
Negative Logits
anwhile     -0.92
Recomm      -0.85
KEY         -0.76
ingo        -0.74
orders      -0.73
ridor       -0.72
TOP         -0.70
ension      -0.70
覚醒        -0.69
ETA         -0.68
Positive Logits
innocent    1.28
bystand     1.24
bystanders  1.15
innocence   1.03
ocent       0.94
innoc       0.83
minded      0.79
civilians   0.77
Innocent    0.72
mole        0.71
Activations Density 0.008%