INDEX
NEGATIVE LOGITS
prescriptions  -0.09
incó           -0.08
inscrição      -0.08
провод         -0.08
Quiz           -0.08
substrates     -0.07
Enfer          -0.07
prescription   -0.07
enfer          -0.07
butterfly      -0.07
POSITIVE LOGITS
greed      0.08
eps        0.08
Lips       0.08
fulfilled  0.08
spite      0.08
-hearted   0.08
So         0.07
arski      0.07
appetite   0.07
greedy     0.07
ACTIVATIONS DENSITY  0.005%