Explanations
mentions of strong positive emotions related to personal memories or experiences
expressions of fondness related to memories
Negative Logits
Token      Logit
irrel      -0.71
DoS        -0.67
udder      -0.67
ulhu       -0.65
aminer     -0.64
helicop    -0.64
adesh      -0.64
DOWN       -0.64
pta        -0.63
IVER       -0.63
Positive Logits
Token      Logit
fond       1.10
uously     0.93
memories   0.91
est        0.88
iously     0.88
entimes    0.87
ties       0.86
ness       0.85
Memories   0.83
algia      0.83
Activations Density: 0.020%