INDEX
Explanations
pronouns in sentences where the subject is being observed or addressed
references to personal relationships and interactions
Negative Logits
ģĸ           -0.68
icion        -0.65
ãĤ¦ãĤ¹       -0.63
creation     -0.63
é¾           -0.62
oso          -0.62
elsen        -0.62
Questions    -0.60
osate        -0.60
osi          -0.59
Positive Logits
smiling      1.00
naked        0.99
alive        0.95
interact     0.93
perform      0.91
behaving     0.89
dancing      0.88
bleed        0.86
unfold       0.86
interacting  0.84
Activations Density: 0.151%