INDEX
Explanations
mentions of the name "Nick."
Negative Logits
Token     Logit
opt       -0.17
udo       -0.15
hower     -0.15
ullah     -0.15
heimer    -0.14
ufe       -0.14
umble     -0.14
aced      -0.14
atus      -0.14
oven      -0.14
Positive Logits
Token     Logit
laus      0.31
olas      0.27
las       0.23
named     0.23
olls      0.22
odem      0.22
names     0.22
olson     0.21
olet      0.20
292       0.19
Activations Density 0.011%