INDEX
Explanations
affirmations or positive statements about experiences and accomplishments
Negative Logits
meric      -0.17
alar       -0.15
ifer       -0.15
eper       -0.14
harmless   -0.14
surprise   -0.14
ron        -0.13
might      -0.13
othy       -0.13
vont       -0.13
Positive Logits
pleasure   0.19
aju        0.17
Ple        0.17
award      0.16
privilege  0.16
fit        0.15
ignum      0.15
系         0.14
grat       0.14
chai       0.14
Activations Density: 0.092%