Explanations
contradictions or contrasting points within the text
Negative Logits
namelijk      -0.61
よいよ        -0.58
zove          -0.54
いよいよ      -0.53
Called        -0.52
</caption>    -0.52
ordnen        -0.51
┻             -0.51
prostu        -0.50
Ouvrez        -0.50
Positive Logits
nevertheless   0.97
nonetheless    0.93
still          0.88
but            0.82
dennoch        0.81
trotzdem       0.79
But            0.78
Nonetheless    0.77
Nonetheless    0.77
Nevertheless   0.76
Activations Density 0.322%