INDEX
Negative Logits
Margins        0.76
(!_            0.75
Funny          0.72
File           0.65
Unsupported    0.65
কুল            0.64
*_             0.64
isNaN          0.63
平凡           0.63
Document       0.62
Positive Logits
aware          0.84
她们           0.82
conscientes    0.77
entrees        0.76
ὥ              0.76
had            0.76
raging         0.76
종료           0.75
commencé       0.75
answered       0.75
Activations Density: 0.002%