Explanations
themes of emotional struggles and interpersonal dynamics
Negative Logits
ivec     -0.15
ifer     -0.14
ique     -0.14
aki      -0.14
vak      -0.14
odic     -0.14
ugi      -0.14
bias     -0.14
.bias    -0.14
pag      -0.14
Positive Logits
perfection    0.20
approval      0.19
approval      0.18
martyr        0.17
perfect       0.17
mük           0.17
Approval      0.16
Perfect       0.16
validation    0.16
好像          0.16
Activations Density 0.289%