INDEX
NEGATIVE LOGITS
Token        Logit
Й            -0.07
.uniform     -0.07
follows      -0.07
subscri      -0.07
Sc           -0.07
dict         -0.06
joins        -0.06
Dec          -0.06
persuaded    -0.06
contempt     -0.06

POSITIVE LOGITS
Token        Logit
.car         0.07
(stderr      0.06
AFL          0.06
entlich      0.06
маль         0.06
преп         0.06
промислов    0.06
netinet      0.06
]='\         0.06
739          0.06

Activations Density: 0.001%