INDEX
Explanations
mentions of specific universities
Negative Logits
nett      -0.17
opher     -0.16
mani      -0.15
otti      -0.15
alice     -0.15
跡        -0.15
.obtain   -0.14
inez      -0.14
spo       -0.14
uhl       -0.14
Positive Logits
ht        0.16
yro       0.15
)(((      0.15
.joda     0.15
(nt       0.14
à¥Īल     0.14
ASI       0.14
itmap     0.14
elif      0.14
yah       0.14
Activations Density 0.016%