INDEX
Explanations
mentions of the name "Paul" in various contexts
Negative Logits
alus      -0.17
orse      -0.17
oyo       -0.16
znik      -0.16
isos      -0.16
Airways   -0.16
esz       -0.16
nbr       -0.15
imu       -0.15
yk        -0.15
Positive Logits
ine       0.31
sen       0.27
raj       0.24
ina       0.22
son       0.21
INE       0.20
ding      0.20
inho      0.20
inus      0.20
ie        0.19
Activations Density: 0.013%