INDEX
NEGATIVE LOGITS
:            0.77
susceptible  0.66
impairs      0.65
it           0.65
alleges      0.64
implies      0.63
pig          0.61
predecessor  0.61
he           0.61
architecture 0.61
POSITIVE LOGITS
Thank        0.96
Clar         0.89
?            0.89
؟.           0.88
Usuario      0.88
(blank)      0.87
?)           0.87
Gracias      0.86
؟؟           0.85
Todo         0.85
Activations Density 0.079%