Negative Logits
Token         Value
워            1.00
忽然          0.98
최초          0.94
突然          0.91
的重要        0.91
noteworthy    0.91
territory     0.90
Territory     0.89
contoh        0.88
명            0.88
Positive Logits
Token         Value
simpler       1.17
instead       1.15
instead       1.15
lieber        1.14
unaffected    1.08
preservar     1.08
únicamente    1.07
calmer        1.06
あくまで      1.02
Instead       1.00
Activations Density: 1.308%