Explanations
dialogue and narrative progression
Negative Logits
Token       Logit
↵↵          1.36
Simply      1.29
Wherever    1.27
Whenever    1.26
Not         1.23
Even        1.20
Saying      1.19
Everyone    1.19
Despite     1.16
When        1.15
Positive Logits
Token           Logit
current         1.09
specific        1.06
verification    1.03
significant     1.02
continuous      1.01
Continuous      0.98
normalization   0.98
Linear          0.97
hours           0.95
additional      0.94
Activations Density 0.036%