Explanations
terminology related to evaluations and judgments in various contexts, especially concerning assessments and incidents of violence
Negative Logits
inders       -0.20
etting       -0.18
ales         -0.17
OrDefault    -0.17
dür          -0.16
dden         -0.16
assistance   -0.15
ìĶ©          -0.15
isher        -0.15
innacle      -0.15
Positive Logits
ively        0.26
ive          0.23
ments        0.22
ors          0.20
(es          0.20
itude        0.19
ment         0.17
ä»¶          0.17
urance       0.17
hole         0.16
Activations Density 0.099%
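
For a sense of scale, a density of 0.099% corresponds to roughly one token in a thousand. The sketch below assumes that density means the fraction of sampled tokens on which the feature's activation is above zero; the dashboard itself does not define the term. The function name and the toy activations are hypothetical and are not taken from the tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (illustrative only): 99 active tokens out of 100,000 sampled.
toy = [1.5] * 99 + [0.0] * (100_000 - 99)
density = activation_density(toy)
print(f"{density:.3%}")  # 0.099%
```

A feature this sparse fires on very few tokens, so its top-activating examples are likely to be more informative than aggregate statistics.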