INDEX
Explanations
phrases starting with "But there are" or similar structures indicating contrasting information
discussions about restrictions and challenges in various contexts
Negative Logits
but       -0.69
Tonight   -0.66
but       -0.64
However   -0.64
However   -0.63
kat       -0.62
tv        -0.61
fixme     -0.61
[|        -0.60
alde      -0.60
Positive Logits
nonetheless    1.08
etheless       1.03
nevertheless   0.86
persisted      0.77
overshadowed   0.76
persists       0.75
undeniably     0.75
dogged         0.74
balk           0.73
elusive        0.73
Activations Density: 1.015%