INDEX
Explanations
names of famous authors
Negative Logits
Token       Logit
ignment     -0.76
ition       -0.72
heed        -0.69
mith        -0.69
ulet        -0.68
rooms       -0.65
=-=-=-=-    -0.64
hur         -0.64
itol        -0.63
child       -0.62
Positive Logits
Token       Logit
Hoover      0.89
ALLY        0.87
swick       0.79
Cheong      0.78
Draper      0.77
Ernest      0.75
ally        0.73
Hem         0.72
Byrne       0.72
ATIONS      0.71
Activations Density 0.047%