INDEX
Explanations
words indicating causation or emphasis in statements
Negative Logits
allerdings    -0.18
meanwhile     -0.17
sice          -0.17
however       -0.15
HOWEVER       -0.15
однако        -0.15
nt            -0.14
/w            -0.13
totiž         -0.13
横            -0.13
Positive Logits
importantly   0.21
vice          0.20
forth         0.18
vice          0.17
some          0.17
ebek          0.16
/OR           0.16
인지          0.14
(?)           0.14
一些          0.14
Activations Density 0.220%