Negative Logits
Token       Value
Work        0.80
Work        0.77
WORK        0.64
WORKS       0.55
workday     0.52
업무        0.52
trabajos    0.51
WORK        0.50
Werk        0.50
work        0.50
Positive Logits
Token         Value
graft         0.55
Graft         0.51
grafted       0.45
GRA           0.44
gra           0.41
determinants  0.41
grafting      0.40
grafts        0.40
PARATION      0.40
word          0.39
Activations Density: 0.003%