INDEX
Explanations
phrases that express complex ideas or rhetorical questions
Negative Logits
Trotzdem      -0.73
entanto       -0.68
nonetheless   -0.68
inoltre       -0.66
μως           -0.65
nevertheless  -0.65
moreover      -0.64
therefore     -0.64
also          -0.62
but           -0.62
Positive Logits
Well          2.06
Well          2.03
well          1.96
well          1.77
WELL          1.49
WELL          1.46
Pues          1.14
Pues          1.07
Hmm           1.01
Umm           1.00
Activations Density: 0.334%