Explanations
names of academic institutions, particularly Johns Hopkins University
mentions of the name "Johns."
Negative Logits
mble       -0.89
tto        -0.80
lier       -0.73
anza       -0.68
xual       -0.68
liest      -0.66
holding    -0.65
Rules      -0.65
edly       -0.62
e          -0.60
Positive Logits
istry      0.93
Hopkins    0.90
insula     0.87
Johns      0.84
assian     0.82
sonian     0.82
olicited   0.81
otom       0.78
ynam       0.78
oc         0.77
Activations Density: 0.033%