INDEX
Explanations
proper nouns or names of individuals
occurrences of the word "had."
Negative Logits
Conclusion  -0.67
ellen       -0.65
Square      -0.64
Mothers     -0.61
awa         -0.61
uating      -0.61
ickle       -0.61
coin        -0.60
DF          -0.60
FW          -0.59
Positive Logits
been        1.32
begun       1.21
gotten      1.20
flown       1.17
gone        1.12
taken       1.10
withdrawn   1.07
undergone   1.07
previously  1.06
eaten       1.04
Activations Density 0.190%