Explanations
phrases indicating problems or issues related to societal challenges
Negative Logits
opard     -0.15
有什么    -0.15
そして    -0.15
ur        -0.14
alc       -0.14
femin     -0.14
име       -0.14
adj       -0.13
urs       -0.13
isha      -0.13
Positive Logits
tw        0.20
:         0.19
despite   0.18
while     0.17
apt       0.16
although  0.16
once      0.16
simple    0.15
omorphic  0.15
这样的    0.15
Activations Density 0.065%
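
The density figure can be read as the share of sampled token positions on which the feature fires. The sketch below assumes density means the fraction of positions with an activation above zero; the dashboard does not state its exact definition, and the function name, threshold, and sample data here are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" is the share of token positions where the
    feature is active. This is not confirmed by the source page.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 13 of 20,000 positions.
sample = [0.0] * 19_987 + [1.2] * 13
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.065% corresponds to roughly 6.5 active positions per 10,000 tokens, which makes this a sparse feature.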