INDEX
Explanations
proper nouns, particularly focusing on names
the name "Richards" in various contexts
Negative Logits
phis      -1.01
phas      -0.91
anooga    -0.86
unity     -0.74
unal      -0.72
emort     -0.72
ivated    -0.72
cue       -0.71
ocular    -0.69
efully    -0.68
Positive Logits
Richards  0.88
cream     0.75
holder    0.73
zman      0.72
iders     0.72
hips      0.71
buster    0.70
DF        0.68
burgh     0.67
Hof       0.66
Activations Density 0.029%