Negative Logits
Token          Logit
easier         -0.54
']))           -0.54
exhibition     -0.53
***!           -0.52
internetowa    -0.52
مشين           -0.51
佗             -0.51
reciprocal     -0.51
//</           -0.51
houding        -0.51
Positive Logits
Token          Logit
soever         0.59
PreferredItem  0.51
woofer         0.50
(missing)      0.49
ides           0.48
Chaucer        0.48
ping           0.48
pad            0.48
did            0.47
piecze         0.46
Activations Density 0.002%