INDEX
Explanations
proper nouns, specifically names of people
Negative Logits
cz            -0.70
simultane     -0.68
hur           -0.67
erest         -0.67
grounds       -0.63
CCP           -0.61
ournals       -0.60
ortium        -0.60
upd           -0.60
adolescence   -0.57
Positive Logits
INAL          0.96
SEE           0.82
Lauder        0.69
Keane         0.69
JD            0.67
ENE           0.65
ESE           0.64
FACE          0.63
#$            0.62
Ü             0.62
Activations Density: 0.625%