Explanations
mentions of medical conditions or treatments
words related to restrictions or prohibitions
Negative Logits
Token        Logit
Trooper      -0.80
Apostles     -0.73
Primordial   -0.71
AGE          -0.71
seaf         -0.68
Dover        -0.67
forth        -0.66
Reynolds     -0.66
raught       -0.66
Crowley      -0.66
Positive Logits
Token        Logit
itor         1.23
itors        1.18
ited         1.08
iting        1.03
iencies      0.99
ctions       0.94
kered        0.94
acters       0.94
inant        0.94
itory        0.93
Activations Density: 0.030%