Explanations
references to emotional experiences and responses
Negative Logits
Token        Logit
emotion      -0.17
emoc         -0.17
emotion      -0.17
reta         -0.16
emotions     -0.16
okable       -0.16
inition      -0.15
pson         -0.14
Emotional    -0.14
лиц          -0.14
Positive Logits
Token         Logit
ality         0.24
intelligence  0.23
roller        0.21
charged       0.21
blackmail     0.20
Intelligence  0.20
attachment    0.20
regulation    0.19
Roller        0.19
циональ       0.18
Activations Density: 0.023%