Explanations
mentions of a specific name or surname
Negative Logits
Token        Logit
er           -0.20
ey           -0.19
jian         -0.16
on           -0.15
o            -0.14
æĿ¥æºIJ       -0.14
Nap          -0.14
Ext          -0.14
onne         -0.13
Selectors    -0.13
Positive Logits
Token        Logit
ibal         0.24
sville       0.17
Ĺ            0.16
ìłIJ          0.16
vá           0.16
ála          0.15
sylvania     0.15
quin         0.15
ecessary     0.15
ady          0.15
Activations Density: 0.035%