Explanations
discussions around controversial social issues and the justifications related to them
Negative Logits
Attempting    -0.16
trys    -0.15
CompatActivity    -0.15
************************************************************************    -0.14
одаÑĢ    -0.14
stial    -0.14
èĤ©    -0.13
Pradesh    -0.13
attempting    -0.13
βε    -0.13
Positive Logits
few    0.27
equipments    0.23
different    0.22
hundred    0.20
Few    0.20
Few    0.18
couple    0.18
différent    0.18
latest    0.17
thousand    0.17
Activations Density: 0.858%