INDEX
NEGATIVE LOGITS
Therefore      0.54
However        0.52
Nevertheless   0.51
Neither        0.50
Tuttavia       0.50
Nonetheless    0.50
\%$,           0.49
எனவே           0.48
Çünkü          0.47
しかし         0.46
POSITIVE LOGITS
popularity     0.45
dieses         0.41
fator          0.41
lather         0.41
father         0.40
धुन            0.40
ordeal         0.39
started        0.38
monogram       0.38
hvad           0.38
Activations Density 0.001%