INDEX
NEGATIVE LOGITS
undai          -0.83
destro         -0.70
necessities    -0.68
conduc         -0.68
plur           -0.65
utilitarian    -0.65
uter           -0.65
mber           -0.64
ogram          -0.64
restraints     -0.64
POSITIVE LOGITS
@#&            1.57
#$             1.23
?!             1.16
@#             1.10
:)             1.05
ãĢį            1.02
[/             1.00
;)             0.97
ðŁĺ            0.96
:-)            0.94
Activations Density 0.313%
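
Two entries in the positive-logit list, ãĢį and ðŁĺ, look like encoding damage but are consistent with the way byte-level BPE vocabularies print raw bytes. The sketch below assumes the model behind this dashboard uses the GPT-2 style byte-to-character table, which the dump itself does not confirm. The helper names token_to_bytes and decode_token are hypothetical and chosen only for illustration.

```python
def bytes_to_unicode() -> dict:
    """Rebuild the GPT-2 style byte -> printable-character table.

    Assumption: the dashboard's tokenizer uses this table.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))  # U+00AE .. U+00FF
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is moved to U+0100 and above, in order.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(cp) for b, cp in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Map a vocabulary string back to the raw bytes it stands for."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a vocabulary string as UTF-8, replacing incomplete sequences."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĢį", "ðŁĺ", ":)"]:
        print(repr(tok), token_to_bytes(tok).hex(" "), repr(decode_token(tok)))
```

Under that assumption, ãĢį stands for the bytes e3 80 8d, which is the UTF-8 encoding of the closing corner bracket 」, and ðŁĺ stands for f0 9f 98, the first three bytes of a four-byte sequence in the emoticon range U+1F600 to U+1F63F. Both fit the rest of the list, which is dominated by symbols and emoticons. ASCII tokens such as :) pass through unchanged.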