INDEX
Negative Logits
UC      -0.08
allas   -0.07
dah     -0.07
out     -0.07
gange   -0.07
idur    -0.07
FC      -0.07
�       -0.07
usehen  -0.07
lid     -0.07
Positive Logits
responsabilidades  0.10
Responsibility     0.09
responsibilities   0.09
duties             0.09
🏼                 0.09
naire              0.08
Verantwortung      0.08
맡                 0.08
responsibility     0.08
TASK               0.08
Activations Density: 0.008%