NEGATIVE LOGITS
advocate     -0.08
ointments    -0.08
earthquake   -0.08
osp          -0.08
advocacy     -0.08
potens       -0.08
proc         -0.08
áz           -0.08
Women        -0.07
Validator    -0.07
POSITIVE LOGITS
koffie       0.08
Alltag       0.08
Initialize   0.08
처음         0.08
belge        0.08
коф          0.08
swirling     0.07
itele        0.07
เ            0.07
Chem         0.07
Activation Density: 0.001%