NEGATIVE LOGITS
feu         0.47
зазна       0.46
weg         0.46
threads     0.46
phen        0.45
pyrazin     0.43
ling        0.43
round       0.43
tez         0.43
voja        0.43
POSITIVE LOGITS
electron    0.59
incentive   0.56
intestine   0.51
microscope  0.48
grape       0.48
incentives  0.47
desk        0.47
mirror      0.47
laurels     0.47
tobacco     0.46
Activations Density: 0.000%