INDEX
Negative Logits
an              0.80
pipe            0.65
questions       0.65
китай           0.63
z               0.62
a               0.61
v               0.60
Bill            0.59
quotes          0.59
Telegraph       0.59
Positive Logits
цаў             0.70
ोन              0.60
ιών             0.58
紛              0.58
trajectories    0.57
administrations 0.57
及              0.57
separations     0.57
biasing         0.57
脲              0.57
Activations Density 0.003%