Explanations
contradictions or contrasting statements
Negative Logits
zwar          -0.93
sice          -0.72
оригіналу     -0.68
übrigens      -0.68
even          -0.64
além          -0.62
だけでなく    -0.61
even          -0.60
はもちろん    -0.60
nejen         -0.60
Positive Logits
nonetheless    1.97
nevertheless   1.90
dennoch        1.32
それでも       1.27
Nonetheless    1.22
trotzdem       1.20
Nevertheless   1.17
néanmoins      1.17
зато           1.16
ändå           1.16
Activations Density: 0.372%
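
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes density means "fraction of tokens with activation above a threshold", and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical rather than taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is
    strictly greater than threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 4 of them active.
sample = [0.0] * 996 + [0.8, 1.5, 0.2, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.372% would mean the feature is active on roughly 4 out of every 1000 tokens in the sampled text.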