INDEX
Explanations
references to famous individuals and their works
Negative Logits
elden   -0.16
ouis    -0.16
iž      -0.15
nict    -0.15
.joda   -0.14
.ua     -0.14
ewis    -0.14
ette    -0.14
assi    -0.14
žen     -0.14
Positive Logits
Dav        0.24
Burn       0.23
af         0.22
Af         0.21
Sark       0.20
Portable   0.18
Af         0.18
dav        0.18
Ti         0.18
Afro       0.18
Activations Density: 0.012%