Explanations
titles or mentions of individuals with the prefix "Dr." and their full name
mentions of medical professionals or doctors
Negative Logits
chorus        -0.79
sled          -0.69
HUD           -0.69
CHAT          -0.69
actionGroup   -0.64
eering        -0.64
quart         -0.64
derby         -0.63
tune          -0.63
LOAD          -0.62
Positive Logits
umin          1.09
inker         1.08
inks          1.07
ifts          1.02
unks          1.01
inking        0.97
udge          0.96
illing        0.96
ink           0.95
astically     0.94
Activations Density: 0.021%