INDEX
Explanations
names of influential individuals in various fields
Negative Logits
rette     -0.18
ipur      -0.18
ceptar    -0.15
ipop      -0.15
ะà¹ģ      -0.14
adro      -0.14
ÑĢиз     -0.14
unks      -0.13
porno     -0.13
riere     -0.13
Positive Logits
‘         0.15
t         0.15
torn      0.15
â         0.14
cko       0.14
Ìģ        0.14
ze        0.14
FRA       0.14
kin       0.13
tm        0.13
Activations Density: 0.348%