Explanations
phrases indicating conditions and qualifications in statements
Negative Logits
avigate       -0.17
iale          -0.17
istrovství    -0.16
крет          -0.15
ibase         -0.15
éra           -0.15
ève           -0.15
pazar         -0.15
aminer        -0.15
queues        -0.14

Positive Logits
means          0.34
virtue         0.31
dint           0.23
means          0.22
analogy        0.22
standards      0.20
-pass          0.20
Means          0.19
laws           0.19
mistake        0.19
Activations Density 0.250%
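
The density figure above can be read as the share of token positions on which the feature is active. The snippet below is a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 400 token positions.
sample = [0.0] * 399 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.250%
```

Under this reading, a density of 0.250% corresponds to roughly one active position per 400 tokens.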