Explanations
Questions and concepts related to social justice, rights, and moral decisions.
Negative Logits
capucha            -0.57
miniaturka         -0.55
floresta           -0.54
elemField          -0.54
galinha            -0.48
amizade            -0.47
pecabe             -0.47
เอง                -0.47
itech              -0.46
AddHtmlAttribute   -0.46
Positive Logits
Jîn                0.44
Rüyada             0.44
️                  0.40
};*/               0.39
representative     0.39
socie              0.39
المعيارى           0.38
Denver             0.38
pap                0.36
Serv               0.36
Activations Density 0.029%