Explanations
concerns related to the safety and well-being of children
Negative Logits
боÑĢ        -0.15
дав         -0.15
lád         -0.15
abor        -0.14
asar        -0.13
.simps      -0.13
phet        -0.13
¶Į          -0.13
stdClass    -0.13
бо          -0.13
Positive Logits
safety      0.63
welfare     0.59
health      0.46
Welfare     0.46
Safety      0.45
wellbeing   0.45
Safety      0.44
afety       0.41
well        0.40
security    0.36
Activations Density 0.099%