INDEX
Explanations
the name "Elli" with different activations, most prominently with the value 10
proper nouns, specifically names of individuals
Negative Logits
citiz       -0.69
unden       -0.68
Gork        -0.66
exha        -0.65
atchewan    -0.65
sg          -0.64
NOR         -0.64
rir         -0.63
Malays      -0.63
phans       -0.62
Positive Logits
quist       1.01
opsis       0.89
ocl         0.86
umen        0.81
zzo         0.81
otherapy    0.79
isance      0.78
ancing      0.76
ocular      0.76
uci         0.74
Activations Density: 0.013%