INDEX
Explanations
words related to culture and music
Negative Logits
ernals    -0.18
anio      -0.16
Sonata    -0.15
ourn      -0.15
hiro      -0.15
Smoke     -0.15
ائز       -0.15
arto      -0.15
zzo       -0.15
isman     -0.15
Positive Logits
się           0.52
sich          0.48
zich          0.32
itself        0.31
ся            0.29
themselves    0.29
sig           0.28
si            0.27
-se           0.26
себя          0.25
Activations Density 0.016%