INDEX
Explanations
expressions of gratitude and acknowledgments of contributions
Negative Logits
orro      -0.16
Virt      -0.16
virt      -0.15
ND        -0.15
zel       -0.14
erfahren  -0.14
vor       -0.13
Ì         -0.13
tee       -0.13
rous      -0.13
Positive Logits
everyone   0.26
all        0.26
所有       0.24
wszyst     0.22
semua      0.22
すべて     0.21
everybody  0.20
everyone   0.20
جميع       0.18
tất        0.18
Activations Density 0.076%