INDEX
Negative Logits
nebude        0.44
mismas        0.40
vanno         0.38
biais         0.38
terão         0.37
mismos        0.37
uso           0.36
stesse        0.36
gyro          0.36
conséquences  0.36
Positive Logits
There         0.43
Chemical      0.41
题目          0.40
While         0.39
World         0.39
ujourd        0.38
The           0.38
title         0.38
When          0.38
prachtige     0.38
Activations Density: 0.330%