Explanations
emotional language related to affection or importance
terms that express affection or endearment
Negative Logits
ioch      -0.77
ammers    -0.71
IDER      -0.70
AMS       -0.70
phrine    -0.70
UID       -0.67
Surv      -0.66
DERR      -0.65
RAFT      -0.64
ARC       -0.64
Positive Logits
dear            1.36
dearly          0.99
departed        0.83
lord            0.79
uncle           0.73
acquaintance    0.72
friend          0.72
iously          0.72
beloved         0.71
iors            0.70
Activations Density 0.004%