Explanations
expressions related to pride, work, and community objectives
Negative Logits
zung    -0.19
ardy    -0.17
ilon    -0.17
tar     -0.16
imals   -0.15
ìĦ      -0.14
pics    -0.14
LOAT    -0.14
LD      -0.14
cht     -0.14
Positive Logits
ollen       0.17
омен        0.16
oller       0.14
βο          0.14
Decompiled  0.14
SAME        0.14
į           0.14
itution     0.14
ãĤ§         0.14
ancode      0.14
Activations Density 0.341%