NEGATIVE LOGITS
他說        0.43
vuole       0.40
piensan     0.39
voulez      0.38
chcesz      0.38
savent      0.38
quiser      0.38
haría       0.37
quieres     0.37
bunu        0.37

POSITIVE LOGITS
belongs     0.49
belong      0.46
consists    0.44
represents  0.44
serves      0.43
belonged    0.43
comprises   0.42
được        0.41
被          0.40
belong      0.40
Activations Density 0.381%