INDEX
Explanations
phrases indicating conditions, obligations, or specifications in legal or formal contexts
Negative Logits
351         -0.16
D           -0.14
ele         -0.14
semb        -0.14
familiar    -0.13
872         -0.13
Boy         -0.13
alsy        -0.13
Vance       -0.13
ฤ           -0.13
Positive Logits
agua        0.17
ex          0.16
ocado       0.16
TEE         0.16
uo          0.15
obe         0.15
εÏĦ         0.14
chop        0.14
abox        0.14
ãĤ¤ãĥĦ      0.14
Activations Density 0.060%
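
Some tokens in the logit lists, such as ãĤ¤ãĥĦ, look like mis-encoded text. A likely reason, assuming the model uses a GPT-2 style byte-level BPE vocabulary (the page itself does not say which tokenizer is used), is that each raw byte is displayed as a stand-in printable character. The sketch below rebuilds that byte-to-character table and reverses it. The helper names bytes_to_unicode and decode_token are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed scheme)."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shifted = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shifted)
            shifted += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Reverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Map a displayed vocabulary string back to readable text.

    Tokens containing characters outside the table are returned unchanged,
    since they were probably not produced by this scheme.
    """
    try:
        raw = bytes([UNICODE_TO_BYTE[ch] for ch in token])
    except KeyError:
        return token
    # A token may hold only part of a multi-byte character, so decode leniently.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¤ãĥĦ", "agua", "εÏĦ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãĤ¤ãĥĦ corresponds to the bytes E3 82 A4 E3 83 84, which is the katakana pair イツ, while plain ASCII tokens such as agua pass through unchanged. Tokens like εÏĦ contain a character outside the table, so the sketch leaves them as they are rather than guessing.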