Explanations
phrases related to personal connections and emotions
mentions of the word "loved" in various contexts
Negative Logits
SPONSORED      -0.96
arta           -0.88
arat           -0.80
illin          -0.78
interstitial   -0.78
agher          -0.76
Dispatch       -0.72
arb            -0.70
oteric         -0.70
arian          -0.68
Positive Logits
dearly         0.95
loved          0.88
uncond         0.84
loving         0.77
Loving         0.74
joy            0.74
loves          0.73
nesday         0.73
ĸļ             0.70
itely          0.70
Activations Density: 0.009%