Explanations
phrases indicating hesitation or conflict
the conjunction "but" indicating contrast or exception
Negative Logits
roy        -0.71
ampa       -0.69
ump        -0.69
tnc        -0.65
entry      -0.65
agra       -0.64
venue      -0.64
aband      -0.63
segment    -0.62
built      -0.61
Positive Logits
tons            1.19
alas            1.09
nevertheless    0.96
luckily         0.94
chery           0.93
hey             0.93
fortunately     0.88
nonetheless     0.87
THEN            0.76
unfortunately   0.76
Activations Density: 0.233%