Explanations
words related to oppression and social injustice
Negative Logits
بالنسبة           -0.52
__('              -0.52
))]               -0.50
})]               -0.48
jsPsych           -0.48
AxisAlignment     -0.45
colate            -0.44
merkungen         -0.44
ilton             -0.44
RegressionTest    -0.43
Positive Logits
hoping            2.18
thinking          1.98
believing         1.87
expecting         1.72
knowing           1.66
fearing           1.58
Hoping            1.57
thinking          1.54
Hoping            1.51
intending         1.46
Activations Density 0.832%