NEGATIVE LOGITS
Them        0.33
them        0.30
There       0.29
esimerk     0.29
bukanlah    0.28
Vereinigte  0.28
Whatever    0.28
Yourself    0.27
Lots        0.26
상당        0.26

POSITIVE LOGITS
soever      0.66
exactly     0.55
they        0.53
exactly     0.52
beit        0.48
we          0.47
much        0.46
much        0.44
exatamente  0.44
else        0.43
Activations Density 0.255%