Explanations
phrases related to excuses and justifications for unacceptable behaviors
Negative Logits
'-';↵
-0.15
ublik
-0.14
'/',↵
-0.14
maal
-0.14
｣
-0.13
якості
-0.13
'/')
-0.13
FieldValue
-0.13
("<
-0.12
"-";↵
-0.12
Positive Logits
"
0.42
«
0.36
'
0.34
`
0.28
“
0.27
「
0.26
="
0.26
\"
0.24
("
0.23
``
0.23
Activations Density 0.210%