INDEX
Explanations
names of individuals associated with various accomplishments and stories
Negative Logits
imens        -0.17
turnstile    -0.17
agens        -0.16
stery        -0.15
_consts      -0.14
ovna         -0.14
AMPL         -0.14
",__         -0.14
erosis       -0.14
орая         -0.14
Positive Logits
“            0.17
‘            0.16
’s           0.16
…            0.15
ko           0.14
cho          0.14
…↵           0.13
WI           0.13
is           0.13
”            0.13
Activations Density 0.119%