Explanations
the presence of specific terms associated with spiritual or religious practices
Negative Logits
Token  Logit
وتسجيلات  -1.07
الرياضيه  -1.03
Datuak  -1.01
AssemblyTitle  -1.00
gynhyrchwyd  -0.99
doInBackground  -0.98
FileDescriptor  -0.96
Portail  -0.95
المعرف  -0.94
expandindo  -0.93
Positive Logits
Token  Logit
<h3>  0.79
<h2>  0.67
<blockquote>  0.62
*  0.61
↵↵  0.61
  0.60
*  0.57
[toxicity=0]  0.57
<h5>  0.55
↵↵↵  0.55
Activations Density: 0.032%