INDEX
Negative Logits
Token        Logit
ྀ            -0.94
BOTH         -0.93
ſi           -0.91
lää          -0.88
HER          -0.88
entemente    -0.88
mères        -0.88
uurs         -0.88
THING        -0.85
OND          -0.84
Positive Logits
Token        Logit
into         1.05
quando       0.97
/            0.94
to           0.89
period       0.85
anderen      0.84
after        0.83
ítez         0.83
when         0.82
躁           0.82
Activations Density: 0.029%