Explanations
introducing contrast though
Negative Logits
Token        Logit
justru       0.55
Despite      0.52
Unfortunately  0.48
Despite      0.48
nawet        0.47
居然         0.47
bla          0.46
EVEN         0.44
mesmo        0.43
несмотря     0.43
Positive Logits
Token        Logit
although     1.56
though       1.50
although     1.45
хотя         1.38
though       1.33
aunque       1.33
embora       1.30
choć         1.30
Хотя         1.13
虽然         1.13
Activations Density: 0.193%
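As a rough illustration of what an activation density figure like this can mean, the sketch below computes the fraction of token positions on which a feature fires. This is an assumption about the metric, not a description of how this dashboard computes it: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active at all. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 193 of 100,000 token positions.
sample = [1.0] * 193 + [0.0] * (100_000 - 193)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.193% would correspond to the feature being active on roughly 2 of every 1,000 tokens.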