Explanations
expressions related to effort and commitment to tasks or responsibilities
Negative Logits
Token        Logit
stället      -0.35
incidents    -0.34
pengar       -0.34
kesempatan   -0.33
kohta        -0.33
pemberian    -0.31
informacji   -0.31
informace    -0.30
μφωνα        -0.30
decisiones   -0.30
Positive Logits
Token        Logit
damage       0.77
damage       0.77
homework     0.73
Damage       0.72
dirty        0.69
justice      0.69
Damage       0.65
Espèce       0.65
thing        0.65
DAMAGE       0.64
Activations Density: 0.153%