INDEX
Explanations
personal development and self-improvement-related phrases
references to the concept of self or personal identity
Negative Logits
asio        -0.71
apo         -0.69
origin      -0.67
meat        -0.67
emis        -0.65
olid        -0.64
[+          -0.63
Rail        -0.62
economic    -0.62
Mub         -0.62
Positive Logits
selves      0.98
tub         0.92
yourself    0.80
mate        0.77
guys        0.74
ocard       0.73
selves      0.73
yourselves  0.71
RS          0.70
で          0.66
Activations Density 0.017%
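
As a rough illustration of what a density figure like this typically measures, the sketch below assumes that activation density is the fraction of token positions on which the feature fires (activation above zero). That definition, the function name activation_density, and the sample data are all assumptions made for illustration; the page itself does not state how the value was computed.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: density = (# positions where the feature fires) / (# positions).
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 token positions, with the feature firing on 2 of them.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.017% would correspond to the feature firing on roughly 17 out of every 100,000 token positions.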