INDEX
Explanations
conjunctions, specifically the word "but"
the word "but" and its variations to identify contrasting ideas or transitions in thought
Negative Logits
uto      -0.69
ump      -0.68
roy      -0.68
uly      -0.67
ige      -0.65
UD       -0.65
amus     -0.64
tnc      -0.64
illed    -0.63
agra     -0.63
Positive Logits
tons             1.22
alas             1.03
fortunately      0.97
chery            0.96
luckily          0.94
nevertheless     0.92
unfortunately    0.87
beware           0.84
nonetheless      0.83
thankfully       0.81
Activations Density 0.209%