INDEX
Explanations
references to medical professionals or "Dr." titles
Negative Logits
ip        -0.20
ened      -0.17
opl       -0.15
eler      -0.15
em        -0.15
iqu       -0.15
encias    -0.14
enville   -0.14
ex        -0.14
ias       -0.14
Positive Logits
iven      0.23
illing    0.20
unken     0.19
inking    0.19
inks      0.19
infeld    0.19
ugged     0.18
aper      0.18
ont       0.16
umm       0.16
Activations Density 0.026%