INDEX
Explanations
Phrases that indicate an increase or rise, in contexts related to inequality or disparity
Negative Logits
Ñģон          -0.15
mars         -0.14
ÑĮ           -0.14
anka         -0.14
PartialView  -0.14
Leer         -0.14
eurs         -0.14
ibur         -0.14
Nx           -0.14
NA           -0.13
Positive Logits
Ary          0.16
ноÑģи        0.15
pron         0.15
loh          0.15
McCoy        0.14
assi         0.14
ibling       0.14
füh          0.14
auge         0.14
ins          0.13
Activations Density: 0.018%