INDEX
Explanations
proper nouns
repeated instances of the term "Cons" or variations thereof
Negative Logits
Token      Logit
WOOD       -0.77
Meadow     -0.71
Hoover     -0.68
Ames       -0.66
Amos       -0.66
scratch    -0.65
Afghans    -0.64
ILLE       -0.64
Gamb       -0.64
Wast       -0.63
Positive Logits
Token      Logit
ervatives  1.50
umers      1.49
idered     1.49
ensus      1.47
piracy     1.46
ensual     1.43
cientious  1.39
istent     1.38
olid       1.33
umed       1.31
Activations Density: 0.025%
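
The second explanation and the positive logits appear to line up: each top positive-logit token reads as a continuation of "Cons". The following is a minimal sketch of that check, not part of the dashboard itself. The `PREFIX` value and the `completions` helper are assumptions introduced for illustration; the token/logit pairs are copied from the list above.

```python
# Top positive-logit tokens and values, copied from the dashboard listing.
POSITIVE_LOGITS = {
    "ervatives": 1.50,
    "umers": 1.49,
    "idered": 1.49,
    "ensus": 1.47,
    "piracy": 1.46,
    "ensual": 1.43,
    "cientious": 1.39,
    "istent": 1.38,
    "olid": 1.33,
    "umed": 1.31,
}

# Assumption: the feature fires on the token "Cons", so the promoted
# tokens are read as continuations of that prefix.
PREFIX = "Cons"


def completions(prefix, logits):
    """Return (prefix + token, logit) pairs, highest logit first."""
    ranked = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    return [(prefix + token, value) for token, value in ranked]


if __name__ == "__main__":
    for word, value in completions(PREFIX, POSITIVE_LOGITS):
        print(f"{word:<16}{value:+.2f}")
```

Under this reading the list spells out words such as "Conservatives", "Consumers", "Considered" and "Conspiracy", with "Consolid" left as a word fragment. The negative-logit tokens show no comparable pattern under this check.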