INDEX
Explanations
phrases related to providing information or insight
the pronouns 'you' and 'us' in various contexts
Negative Logits
uting       -0.68
Archdemon   -0.66
iffe        -0.64
ourgeois    -0.64
ute         -0.63
ilde        -0.63
rating      -0.61
annot       -0.61
Polit       -0.60
nown        -0.58
Positive Logits
permission  1.19
pause       1.16
ample       1.15
insight     1.13
plenty      1.08
access      1.05
clues       1.01
glimps      1.00
insights    0.99
nightmares  0.97
Activations Density: 0.082%