INDEX
NEGATIVE LOGITS
supplement  0.54
Year        0.51
Aug         0.50
asistir     0.50
ise         0.48
assisted    0.47
Przed       0.47
ويت         0.47
oder        0.47
ALA         0.47
POSITIVE LOGITS
pathos         0.49
심             0.46
superficially  0.44
authorship     0.42
赞             0.42
今回           0.41
anek           0.41
communautés    0.41
Official       0.41
intercom       0.40
Activations Density 0.002%