Explanations
proper nouns related to a person named Woods
mentions of the name "Woods."
Negative Logits
otation      -0.73
LM           -0.70
itia         -0.68
ients        -0.68
utter        -0.67
ially        -0.65
inical       -0.64
ī            -0.64
usions       -0.64
activation   -0.64
Positive Logits
hed          1.02
Hole         0.94
Woods        0.93
manship      0.90
boro         0.89
enegger      0.88
bury         0.87
schild       0.83
suit         0.81
hole         0.79
Activations Density: 0.019%