INDEX
Explanations
adverbs indicating frequency or degree
Negative Logits
allerdings   -0.19
sice         -0.17
nt           -0.15
meanwhile    -0.15
HOWEVER      -0.15
ainsi        -0.14
however      -0.14
однако       -0.14
rne          -0.13
/w           -0.13
POSITIVE LOGITS
importantly  0.21
vice         0.19
/OR          0.18
some         0.18
vice         0.18
forth        0.16
/or          0.16
ebek         0.16
suffer       0.15
(?)          0.15
Activations Density 0.258%