Explanations
phrases indicating negative outcomes or disappointments
instances of the word "Unfortunately."
Negative Logits
Token        Logit
reference    -0.67
rank         -0.66
cluster      -0.64
mobile       -0.62
count        -0.61
fresh        -0.61
separate     -0.61
union        -0.59
relation     -0.58
white        -0.58
Positive Logits
Token            Logit
Unfortunately    3.03
Unfortunately    2.71
Sadly            2.66
Sadly            2.31
Thankfully       1.90
Fortunately      1.89
Luckily          1.77
Fortunately      1.77
Thankfully       1.75
Alas             1.69
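
The two logit lists above look like the extremes of a ranked list of per-token scores. The following is a minimal sketch of that ranking step, assuming the scores are already available as (token, score) pairs. The function name `top_and_bottom_logits` is hypothetical, and the sample pairs are a handful of values copied from the lists above; this is not the dashboard's own code.

```python
from typing import List, Tuple

Scored = List[Tuple[str, float]]

def top_and_bottom_logits(scores: Scored, k: int = 10) -> Tuple[Scored, Scored]:
    """Return (k most negative, k most positive) entries by score.

    Pairs are used instead of a dict because the same token text can
    appear more than once with different scores.
    """
    ascending = sorted(scores, key=lambda pair: pair[1])
    negative = ascending[:k]           # most negative first
    positive = ascending[::-1][:k]     # most positive first
    return negative, positive

# Illustrative subset of the values listed above (not the full vocabulary).
sample = [
    ("reference", -0.67),
    ("Unfortunately", 3.03),
    ("Sadly", 2.66),
    ("white", -0.58),
    ("Alas", 1.69),
]

negative, positive = top_and_bottom_logits(sample, k=2)
print(negative)
print(positive)
```

Keeping the input as a list of pairs is a deliberate choice here: the positive list repeats "Unfortunately", "Sadly", "Fortunately", and "Thankfully" with different scores, so a token-keyed mapping would silently drop entries.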
Activation Density: 0.024%
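
One way to read the density figure above is as the share of sampled tokens on which the feature fires. The sketch below assumes that reading, with "fires" meaning an activation strictly above a threshold of zero. The function name `activation_density` and the sample data are hypothetical; the sample is synthetic and constructed only so that 24 of 100,000 positions are active.

```python
from typing import Sequence

def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Synthetic data (an assumption, not real measurements):
# 24 active positions out of 100,000.
sample_activations = [0.0] * 99_976 + [1.5] * 24

density = activation_density(sample_activations)
print(f"{density:.3%}")  # prints "0.024%"
```

Under this reading, 0.024% corresponds to roughly one active token in every 4,000 or so.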