Explanations
phrases related to decision-making processes and social issues
Negative Logits
athan       -0.15
ÐĵÐŀ        -0.15
اضÛĮ        -0.15
ÛĮØ·        -0.15
arpa        -0.14
ĺħ          -0.14
antu        -0.14
quipment    -0.14
ibir        -0.13
.micro      -0.13
Positive Logits
differently  0.17
style        0.16
oji          0.16
Kop          0.16
zk           0.16
style        0.15
anou         0.15
269          0.15
пÑĢав        0.15
et           0.14
Activations Density: 0.248%