INDEX
Explanations
phrases indicating a negative consequence or action
the conjunction "and" in various contexts
Negative Logits
Token      Logit
Powered    -0.72
doi        -0.67
atars      -0.63
Tid        -0.63
çīĪ        -0.60
Osw        -0.59
lift       -0.58
eatures    -0.56
Magnus     -0.55
owered     -0.55
Positive Logits
Token          Logit
nor            1.51
therefore      1.36
hence          1.33
consequently   1.30
furthermore    1.21
thus           1.18
secondly       1.09
moreover       1.06
yet            1.03
Therefore      0.94
Activations Density: 0.284%
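
The page does not say how the activation density is computed. One common reading, assumed here, is the fraction of token positions at which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all, not the mean activation strength.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 3 of them active.
toy = [0.0] * 997 + [0.8, 1.2, 0.3]
print(f"{activation_density(toy):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.284% would mean the feature fires on roughly 3 of every 1000 tokens in the sampled text.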