INDEX
Negative Logits
neka      0.49
ו         0.47
kole      0.46
zid       0.46
gleichen  0.45
tipos     0.45
aceea     0.44
ודה       0.44
tipp      0.43
andra     0.43
Positive Logits
EVP            0.47
externalities  0.45
allusion       0.43
overw          0.43
secrecy        0.42
overwritten    0.41
HW             0.41
introspection  0.41
VU             0.40
inhabit        0.40
Activations Density 0.000%