Explanations
phrases indicating contrast or adversity
Negative Logits
Token         Logit
genius        -0.58
entweder      -0.55
tellations    -0.53
кож           -0.52
mags          -0.51
testens       -0.51
からです      -0.50
sowieso       -0.49
']='          -0.49
iecie         -0.49
Positive Logits
Token         Logit
Trotz         1.34
despite       1.26
ostante       1.22
Nevertheless  1.21
Despite       1.20
Despite       1.20
despite       1.20
spite         1.19
Nevertheless  1.17
Nonetheless   1.17
Activations Density: 0.143%