Explanations
proper nouns, specifically names of individuals
Negative Logits
功          -0.15
Booth       -0.15
starter     -0.14
exemplary   -0.13
egral       -0.13
::          -0.13
esty        -0.13
事務        -0.13
something   -0.13
odd         -0.13
Positive Logits
ermann      0.15
ayah        0.14
uke         0.14
Shuffle     0.14
okino       0.13
grily       0.13
etim        0.13
жовт        0.13
ukes        0.13
andles      0.13
Activations Density 0.391%