INDEX
Explanations
references to the name "Alexander."
Negative Logits
neys     -0.78
zee      -0.76
rers     -0.74
ears     -0.74
etc      -0.73
fare     -0.70
eling    -0.67
glers    -0.67
bing     -0.66
cheon    -0.65
Positive Logits
Luthor      0.85
Gust        0.84
Gaul        0.81
andr        0.80
opoulos     0.78
Hamilton    0.78
ander       0.77
iev         0.76
Payne       0.76
Berk        0.76
Activations Density 0.037%