Negative Logits
arbitrarily   0.91
adversarial   0.91
attempts      0.88
attempts      0.87
lmao          0.86
somewhat      0.86
ravariant     0.84
ㅋㅋ          0.84
attempt       0.84
Attempts      0.84
POSITIVE LOGITS
Our           1.62
Our           1.60
Discover      1.42
our           1.39
Discover      1.38
nuestros      1.33
discover      1.30
我們的        1.30
nuestro       1.29
nuestra       1.26
Activations Density 0.473%