Negative Logits
ORPG        -0.64
Balanced    -0.59
uggest      -0.58
Fired       -0.57
Ended       -0.57
Binding     -0.55
Explan      -0.55
corn        -0.55
Flavoring   -0.52
Respons     -0.52
Positive Logits
albeit            1.19
uh                1.07
alas              1.04
however           0.96
um                0.95
unsurprisingly    0.84
namely            0.81
respectively      0.81
moreover          0.80
albeit            0.77
Activations Density: 0.995%