Explanations
terms expressing contradiction or exception
nevertheless
Negative Logits
AnchorTagHelper   -0.71
Himo              -0.64
(no token shown)  -0.60
AndEndTag         -0.60
OuterAlt          -0.58
kasarigan         -0.58
llary             -0.56
kaarangay         -0.54
IntoConstraints   -0.54
Aid               -0.52
Positive Logits
却          1.96
卻          1.80
但却        1.23
却不        1.20
却是        1.13
却没有      1.11
却被        1.09
却又        1.04
però        0.81
justru      0.74
Activations Density 0.002%