INDEX
Explanations
references to medical professionals, particularly doctors
references to medical professionals, specifically doctors
Negative Logits
yss            -0.72
*/(            -0.67
cluding        -0.64
Huff           -0.63
issan          -0.62
AW             -0.62
CHAT           -0.62
ndra           -0.61
theless        -0.61
Flavoring      -0.61
Positive Logits
doctor         1.09
doctor         1.05
physician      0.88
examiner       0.87
Doctor         0.86
practitioner   0.80
iate           0.78
abase          0.78
killer         0.78
Doctor         0.78
Activations Density 0.013%