NEGATIVE LOGITS
some        0.52
what        0.51
another     0.48
a           0.48
becomes     0.48
algumas     0.47
跇          0.46
underpin    0.46
become      0.45
those       0.44
POSITIVE LOGITS
president   0.63
Federalist  0.57
restaurant  0.56
Himalayas   0.55
plaintiff   0.55
wealthiest  0.55
protagonist 0.54
defendant   0.53
Louvre      0.53
Dalai       0.53
Activations Density 0.012%