INDEX
Explanations
contrasting conjunctions that indicate opposition or alternative perspectives
NEGATIVE LOGITS
không  -0.16
nicht  -0.15
=?  -0.15
caps  -0.14
ENTA  -0.13
不能  -0.13
niet  -0.13
Bold  -0.13
evin  -0.13
才能  -0.13
POSITIVE LOGITS
rather  0.66
Rather  0.56
rather  0.55
Rather  0.53
plutôt  0.39
eher  0.26
بلکه  0.25
spíše  0.25
instead  0.20
sondern  0.19
Activations Density 0.051%