INDEX
Explanations
numerical patterns with specific sequences
instances of personal reflection or emotional experiences
Negative Logits
atis      -0.77
extrad    -0.71
bloc      -0.70
sacked    -0.68
camoufl   -0.67
axe       -0.66
opposing  -0.65
allied    -0.65
arbit     -0.65
disav     -0.64
Positive Logits
My                1.11
³³³³              1.04
³³³³³³³³³³³³³³³³  1.02
Anyway            1.01
Years             0.98
Thankfully        0.97
³³³               0.97
However           0.96
Sadly             0.95
MY                0.95
Activations Density 0.459%