INDEX
Explanations
expressions of regret or reflection on personal experiences
Negative Logits
orum        -0.73
unin        -0.73
chance      -0.68
arcity      -0.68
alist       -0.62
tackle      -0.60
WATCHED     -0.59
exclusive   -0.58
allas       -0.58
undercut    -0.58
Positive Logits
My          1.04
My          0.95
I           0.85
my          0.81
Hi          0.79
myself      0.78
Writing     0.76
MY          0.76
Dear        0.74
Growing     0.73
Activations Density: 0.982%