Explanations
sentences related to personal experiences and reflections
Negative Logits
Token     Logit
They      -0.70
)</       -0.68
idates    -0.66
Their     -0.65
ãħĭ       -0.63
Lot       -0.58
Says      -0.57
Alert     -0.57
eds       -0.56
doesnt    -0.56
Positive Logits
Token       Logit
myself      1.88
my          1.49
MY          0.90
My          0.85
My          0.82
ourselves   0.81
my          0.79
blogging    0.71
admittedly  0.69
naïve       0.66
Activations Density: 1.005%