Explanations
themes of self-acceptance and personal empowerment
Negative Logits
Token        Logit
opa          -0.15
nu           -0.14
uty          -0.14
><?          -0.14
jer          -0.14
endo         -0.14
ze           -0.14
Glover       -0.14
ugen         -0.14
adr          -0.14
Positive Logits
Token        Logit
confidence   0.24
-confidence  0.23
confidence   0.23
Confidence   0.22
Self         0.22
self         0.20
SELF         0.19
Self         0.19
.self        0.19
confident    0.18
Activations Density: 0.099%