INDEX
Explanations
phrases that indicate struggles, inequalities, or challenges faced by individuals or communities
Negative Logits
nez         -0.16
off         -0.15
ResourceId  -0.15
476         -0.15
resourceId  -0.15
inne        -0.14
èħķ         -0.14
itzer       -0.13
opens       -0.13
oons        -0.13
Positive Logits
imei        0.16
naopak      0.16
rol         0.15
âm          0.15
çģ          0.15
convers     0.14
heels       0.14
atto        0.14
icol        0.14
ãĥ¼ãĤ¯      0.14
Activations Density 0.216%