Explanations
terms indicating negative impacts or harmful effects
detrimental or detriment
Negative Logits
kindness    -0.56
timacy      -0.55
doma        -0.55
Dapper      -0.54
Harsh       -0.54
agency      -0.54
Kindness    -0.54
Hardy       -0.54
hybrid      -0.53
CAPE        -0.53
Positive Logits
detriment         0.59
nahilalakip       0.53
jälkeen           0.46
detrimental       0.45
relegated         0.45
enschappelijke    0.45
decoración        0.42
inmediatamente    0.42
ésult             0.42
Spoljašnje        0.41
Activations Density: 0.011%