INDEX
Explanations
expressions of gratitude and conversation themes
Negative Logits
.ua      -0.20
lds      -0.17
uros     -0.15
bject    -0.14
ulp      -0.14
Īëĭ¤     -0.14
iture    -0.14
gger     -0.14
oog      -0.14
ourg     -0.14
Positive Logits
dem      0.30
mi       0.30
fi       0.28
di       0.27
nu       0.24
mek      0.23
pon      0.23
seh      0.22
tek      0.22
Mi       0.22
Activations Density: 0.002%