INDEX
Explanations
clauses or phrases expressing claims, decisions, or proposals
they and their actions
NEGATIVE LOGITS
IntoConstraints    -0.55
GenerationType     -0.49
blessures          -0.43
mergeFrom          -0.40
[token missing]    -0.38
Slf                -0.38
เกิน               -0.36
žena               -0.35
mesi               -0.35
själv              -0.35
POSITIVE LOGITS
themselves    1.20
Their         1.11
Their         1.10
themselves    1.10
their         1.09
their         1.09
they          0.94
THEIR         0.93
they          0.92
mereka        0.92
Activations Density: 0.763%
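
A density figure like the one above is usually read as the fraction of token positions on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from this dashboard or from any particular tool's API.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 token positions (illustrative only).
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "12.500%"
```

Under this reading, a density of 0.763% would mean the feature is active on fewer than 1 in 100 sampled tokens, which fits a feature that fires mainly on third-person plural pronouns.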