Negative Logits
under      -0.89
side       -0.75
that       -0.73
ohyd       -0.73
of         -0.73
abdomen    -0.71
nicht      -0.70
نمی        -0.70
ghter      -0.69
永恒       -0.69
Positive Logits
Told       1.03
told       0.98
Told       0.79
TRUST      0.77
hängen     0.77
THEORY     0.76
Escola     0.75
Iva        0.74
iva        0.74
zwungen    0.74
Activations Density: 0.040%