INDEX
Explanations
conjunctions indicating relationships or connections within sentences
Negative Logits
strict    -0.18
lica      -0.17
cona      -0.16
strict    -0.15
ica       -0.15
liberal   -0.15
mina      -0.15
ourg      -0.14
Strict    -0.14
STRICT    -0.14
Positive Logits
astically      0.23
ensely         0.19
aneously       0.18
ensively       0.17
ально          0.17
trib           0.16
ynchronously   0.16
čně            0.16
ably           0.16
avě            0.16
Activations Density 0.098%