INDEX
Explanations
proper nouns, specifically names of individuals
recurring mentions of the name "Ann"
Negative Logits
Scouting       -0.65
dare           -0.65
questioning    -0.63
passage        -0.59
typing         -0.59
scratching     -0.59
gym            -0.58
scene          -0.58
dogs           -0.58
Combine        -0.58
Positive Logits
ouncing        1.61
ihilation      1.60
ounces         1.53
ounced         1.52
ihil           1.51
ounce          1.46
iversary       1.39
unciation      1.39
apolis         1.34
abel           1.30
Activations Density: 0.024%