INDEX
Explanations
pronouns and demonstrative adjectives
Negative Logits
aget       -0.15
silver     -0.14
lawful     -0.14
ifact      -0.14
shiny      -0.14
ussy       -0.14
osh        -0.13
full       -0.13
ĥ          -0.13
Answers    -0.13
Positive Logits
zelf       0.17
agma       0.16
ViewState  0.15
abela      0.15
ayah       0.15
elop       0.14
UIFont     0.14
andel      0.14
arness     0.14
erver      0.14
Activations Density: 0.011%
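
To give a sense of scale for the density figure, here is a minimal sketch. It assumes that density means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not state this definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and exist only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 11 of 100,000 tokens.
sample = [1.0] * 11 + [0.0] * 99_989
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.011%
```

Under that assumption, a density of 0.011% corresponds to roughly 11 active tokens per 100,000 sampled, which marks this as a very sparse feature.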