Explanations
contrasting conjunctions and phrases that indicate complexity or nuance in relationships between ideas
Negative Logits
not
-0.35
NOT
-0.33
không
-0.29
не
-0.27
nicht
-0.27
不
-0.26
niet
-0.26
Not
-0.24
not
-0.24
tidak
-0.24
Positive Logits
rather
0.39
rather
0.32
Rather
0.32
Rather
0.30
بلکه
0.25
actually
0.25
sondern
0.22
importantly
0.20
plutôt
0.19
quite
0.18
Activations Density 0.098%