INDEX
Explanations
adjectives with positive connotations placed before the word "but," suggesting a contrast or unexpected element
the word "but" used in contrasting statements
Negative Logits
ento       -0.90
oire       -0.87
ORPG       -0.67
eatures    -0.67
itness     -0.66
asions     -0.63
OLOGY      -0.63
utonium    -0.62
axter      -0.62
ogens      -0.62
Positive Logits
nonetheless    1.16
nevertheless   1.09
tons           1.08
chery          0.99
still          0.91
unmist         0.89
manageable     0.89
effective      0.88
still          0.86
undeniably     0.85
Activations Density: 0.088%