INDEX
Explanations
proper nouns, specifically names of people and awards
Negative Logits
STRU        -0.15
Reese       -0.15
Bilg        -0.14
GENCY       -0.14
gens        -0.14
otherwise   -0.14
Pron        -0.14
479         -0.13
uby         -0.13
verted      -0.13
Positive Logits
owie        0.16
ovny        0.16
decision    0.16
DW          0.15
ç¸          0.15
DM          0.15
ream        0.15
Decision    0.15
ynom        0.15
_dm         0.14
Activations Density 0.030%