Explanations
expressions of opinion or preference related to products or experiences
Negative Logits
elsewhere
-0.17
someone
-0.15
別
-0.15
occasionally
-0.14
periodically
-0.14
sometimes
-0.14
另
-0.13
another
-0.13
บาง
-0.13
Sometimes
-0.13
Positive Logits
所有
0.72
all
0.70
every
0.68
wszyst
0.64
すべて
0.63
semua
0.61
모든
0.60
everything
0.60
всех
0.57
todas
0.55
Activations Density 1.118%