Negative Logits
handicrafts      0.45
indoctr          0.44
childish         0.43
neutrinos        0.43
𒐪                0.42
upbringing       0.42
overfitting      0.42
figuratively     0.42
utensils         0.41
propositional    0.41
Positive Logits
Past             0.42
P                0.41
JEN              0.41
ANG              0.40
Kons             0.39
M                0.39
O                0.39
Ak               0.38
Festival         0.38
W                0.38
Activations Density: 0.224%