INDEX
NEGATIVE LOGITS
iture          -0.67
="#            -0.58
pse            -0.56
benefit        -0.56
worldly        -0.55
vec            -0.52
htaking        -0.51
chimpan        -0.49
advertisement  -0.49
handwritten    -0.49
POSITIVE LOGITS
onwards        1.07
onward         0.96
thereafter     0.75
rosso          0.67
.              0.67
additionally   0.66
;              0.61
again          0.61
completes      0.60
Lastly         0.59
Activations Density 0.253%