Explanations
prominent names of influential individuals
Negative Logits
hete       -0.16
bul        -0.16
bur        -0.16
rewrite    -0.16
334        -0.16
089        -0.15
iggers     -0.15
Dün        -0.14
Ñĸг        -0.14
_vc        -0.14
Positive Logits
ascade     0.17
fty        0.15
quadr      0.15
ser        0.15
triple     0.14
sil        0.14
nock       0.14
-scalable  0.13
Cyril      0.13
iciel      0.13
Activations Density: 0.042%