INDEX
Negative Logits
ideales        0.90
Freude         0.89
Fähigkeiten    0.89
বন্ধু            0.84
ಆರೋಗ್ಯ          0.83
filosof        0.83
사회            0.83
myButtons      0.81
жизнь          0.80
социа          0.80
Positive Logits
offending      2.02
problematic    1.78
troublesome    1.68
culprit        1.66
unruly         1.64
faulty         1.62
culprits       1.61
offenders      1.61
violating      1.61
objectionable  1.60
Activations Density 0.297%