INDEX
Explanations
personal growth and positive affirmation-related words and phrases
expressions of emotional states and interactions with people and surroundings
Negative Logits
Token        Logit
confir       -0.72
predec       -0.70
ij士          -0.62
notor        -0.60
proport      -0.59
Instr        -0.59
ãĥ¼ãĥĨãĤ£    -0.58
Recomm       -0.58
¥ŀ           -0.57
conclud      -0.56
Positive Logits
Token        Logit
huh          0.97
!!!!         0.96
!?           0.95
!            0.91
!!!          0.90
!:           0.89
!!           0.89
?!           0.89
?            0.85
!!!!!!!!     0.82
Activations Density: 0.711%
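
The density figure is usually read as the share of token positions on which the feature fires. The sketch below is one minimal way to compute such a number, assuming density means "fraction of positions with activation above a threshold". The function name activation_density, the threshold parameter, and the toy data are hypothetical and do not come from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions on which the feature
    is active at all; the dashboard's exact definition is not stated here.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up): 1000 token positions, 7 of which fire.
toy = [0.0] * 993 + [0.4, 1.2, 0.1, 2.5, 0.9, 0.3, 0.05]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.700%
```

Under this reading, a density of 0.711% would mean the feature is active on roughly 7 of every 1,000 sampled tokens.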