Explanations
elements related to social and economic comparisons, focusing on degrees of excitement or quality of living
Negative Logits
enough        -0.23
erah          -0.16
достаточно    -0.16
まま          -0.15
chw           -0.15
터            -0.14
Enough        -0.14
very          -0.14
loud          -0.14
anymore       -0.14
Positive Logits
than          0.96
than          0.83
-than         0.77
THAN          0.75
Than          0.73
Than          0.71
_than         0.70
niż           0.57
než           0.53
_THAN         0.50
Activations Density 0.482%