INDEX
Explanations
phrases related to self-awareness and personal decision-making
references to self-identity and personal agency
Negative Logits
rought       -0.82
olid         -0.75
onite        -0.72
emis         -0.69
pour         -0.68
rise         -0.68
vals         -0.67
nis          -0.67
Syndicate    -0.66
orie         -0.64
Positive Logits
mate          0.76
Redd          0.74
selves        0.73
creatively    0.71
BOOK          0.70
selves        0.69
龍喚士        0.68
profess       0.68
personally    0.67
imei          0.66
Activations Density 0.051%
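
For readers unfamiliar with the density figure, the following is a minimal sketch of one common way such a number could be computed. It assumes density means the percentage of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [1.3]
density = activation_density(sample)
print(f"Activations Density {density:.3f}%")
```

Under this reading, a density of 0.051% would correspond to the feature firing on roughly 5 of every 10,000 tokens.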