INDEX
Explanations
sentiments related to approval and criticism
Negative Logits
fortun      -0.73
smoot       -0.73
Marketable  -0.73
photoc      -0.68
disrupted   -0.67
stomp       -0.66
anwhile     -0.65
outl        -0.64
Tanz        -0.64
synd        -0.64
Positive Logits
Ļ           1.60
¬           1.16
į           1.11
ħ           1.11
¤           1.07
ª           1.04
ĵ           1.01
Ĵ           1.01
Ĺ           0.98
Ĩ           0.97
Activations Density: 0.333%