INDEX
Explanations
emotions or internal states related to the heart or mind
references to emotional or metaphorical concepts related to hearts and minds
Negative Logits
onomy            -0.68
ression          -0.66
communism        -0.66
ressive          -0.63
heterogeneity    -0.60
ECH              -0.60
avoidance        -0.58
treatment        -0.58
withdrawal       -0.57
Recomm           -0.57
Positive Logits
paces            1.58
chool            1.50
pace             1.45
creen            1.38
mith             1.34
pring            1.30
hare             1.27
peed             1.23
poons            1.21
hips             1.21
Activations Density 0.244%