Explanations
conjunctions and phrases indicating contrasts or comparisons
Negative Logits
Yet          -0.18
Yet          -0.18
chua         -0.17
thers        -0.17
HOWEVER      -0.15
yet          -0.15
elsewhere    -0.14
vice         -0.14
meanwhile    -0.14
GE           -0.14
Positive Logits
بلکه         0.27
sino         0.25
sondern      0.24
ampo         0.17
but          0.17
мор          0.16
että         0.15
lijah        0.15
odelist      0.14
.TestTools   0.14
Activations Density 0.021%