INDEX
Explanations
emotional dynamics and dependency in relationships
Negative Logits
.IDENTITY    -0.15
ãģ¤ãģ¶       -0.14
retire       -0.14
ostel        -0.14
eken         -0.14
·æĸ°         -0.14
ëįķ          -0.14
defeat       -0.13
inspir       -0.13
(tuple       -0.13
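
Three of the negative-logit tokens (ãģ¤ãģ¶, ·æĸ°, ëįķ) look like byte-level BPE token strings rather than readable text. The sketch below shows one way to turn them back into characters. It assumes the tokenizer uses the GPT-2-style byte-to-unicode mapping; the page does not name the tokenizer, so treat that as a guess. The helper names bytes_to_unicode and decode_token are illustrative and are not taken from the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token string back to bytes, then decode as UTF-8.

    Bytes that do not form a complete UTF-8 sequence become U+FFFD.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãģ¤ãģ¶", "ëįķ", "·æĸ°", "retire"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãģ¤ãģ¶ decodes to つぶ and ëįķ to 덕. The token ·æĸ° begins with a stray continuation byte, so only its last character (新) is recoverable; the first byte belongs to a character split across token boundaries.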
Positive Logits
ghost         0.24
Ghost         0.23
Ghost         0.22
hurt          0.21
friendship    0.21
ghost         0.21
toxicity      0.20
manip         0.20
Toxic         0.20
Distance      0.19
Activations Density 0.623%