INDEX
Negative Logits
during     0.75
lungo      0.74
EVERY      0.73
perceived  0.72
durante    0.72
along      0.71
alongside  0.71
eaves      0.70
toward     0.70
बिफोर      0.70
Positive Logits
ጠት        0.79
^{+}      0.76
ס         0.74
সাহায্য     0.73
ቻል        0.72
某种       0.72
}^{-},    0.70
寻找       0.70
獲         0.70
更好的     0.69
Activations Density 0.208%