INDEX
Explanations
phrases containing the word "but" in sentences
the word "but" indicating contrast or exception
Negative Logits
tnc        -0.81
usting     -0.76
iggurat    -0.75
sense      -0.73
dayName    -0.67
edu        -0.66
代         -0.65
oire       -0.64
Laughs     -0.64
ivot       -0.63
POSITIVE LOGITS
declined      1.02
lacked        0.97
retains       0.96
alas          0.95
lacks         0.94
excludes      0.91
fortunately   0.91
nevertheless  0.91
luckily       0.90
unlike        0.88
Activations Density 0.166%