Explanations
phrases and terms related to adherence to rules and regulations
Negative Logits
Token         Logit
eyn           -0.17
icina         -0.16
éĩİ           -0.16
iale          -0.16
amed          -0.15
elier         -0.15
Wes           -0.14
ÙħÛĮÙĦادÛĮ    -0.14
ature         -0.14
Away          -0.14
Positive Logits
Token         Logit
strictly      0.25
scr           0.24
fully         0.23
string        0.22
/non          0.21
Fully         0.19
Scr           0.18
fully         0.18
strict        0.18
Strict        0.17
Activations Density 0.026%
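
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 26 of 100,000 token positions.
sample = [1.5] * 26 + [0.0] * (100_000 - 26)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.026%
```

Under this assumption, a density of 0.026% corresponds to roughly one active token in every 3,800.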