INDEX
Explanations
phrases indicating criticism or condemnation of exploitation and dishonesty
Negative Logits
Token           Logit
profesional     -0.55
actores         -0.52
Actors          -0.49
actors          -0.49
Music           -0.49
concerts        -0.48
profissionais   -0.48
Representation  -0.47
professional    -0.46
profesionales   -0.46
Positive Logits
Token            Logit
propOrder        0.88
صوتيه            0.86
TestBed          0.86
Искәрмәләр       0.82
تانيه            0.80
חיצוניים         0.77
ьаж              0.76
synth            0.76
disambiguazione  0.75
+#+#             0.74
Activations Density: 0.113%