INDEX
NEGATIVE LOGITS
propone  0.47
tycker  0.41
acudir  0.40
мнению  0.40
}|$  0.40
👎  0.39
ವಿದೆ  0.39
削  0.38
垫  0.38
醮  0.38
POSITIVE LOGITS
complicated  0.62
peculiar  0.58
confusing  0.56
strange  0.55
strange  0.54
complicated  0.54
technically  0.53
confuses  0.52
awkward  0.51
weird  0.49
Activations Density 0.030%