INDEX
Explanations
negative descriptors, particularly focusing on feelings of inadequacy and challenges
Negative Logits
UTTON     -0.17
кав       -0.17
iso       -0.16
Bez       -0.15
utton     -0.15
inions    -0.15
pose      -0.15
itel      -0.14
alo       -0.14
ê»        -0.14
Positive Logits
-Free     0.17
-free     0.17
.Restr    0.16
št        0.15
jokes     0.15
rosso     0.14
Pep       0.14
ECTOR     0.14
hack      0.14
nak       0.14
Activations Density 0.913%