INDEX
NEGATIVE LOGITS
learning      0.51
traded        0.50
deaths        0.49
importance    0.48
importanza    0.48
gangenheit    0.46
uptick        0.46
relevant      0.46
reise         0.45
teaching      0.44
POSITIVE LOGITS
and           0.43
that          0.41
trabajar      0.40
Т             0.40
straighten    0.40
عنوان         0.40
works         0.39
ligation      0.39
arbejde       0.39
oglas         0.39
Activations Density 0.002%