INDEX
Negative Logits
Alem            0.38
Considerable    0.38
存在            0.36
mempers         0.36
一方で          0.35
Activate        0.35
ceff            0.35
downstream      0.35
수한            0.35
므로            0.34

Positive Logits
honestly        0.44
препо           0.42
няколко         0.42
ᔕ               0.41
legit           0.41
mores           0.40
câteva          0.40
Honestly        0.40
algumas         0.39
sooo            0.38
Activations Density: 0.039%