INDEX
Explanations
The neuron is looking for terms related to comparisons and changes in pace
Phrases related to keeping up with progress or maintaining pace
Negative Logits
rique     -0.67
ells      -0.64
ums       -0.63
ints      -0.62
ell       -0.62
int       -0.61
jun       -0.61
umb       -0.60
kinson    -0.60
ogi       -0.59
Positive Logits
Behind            0.75
WARD              0.74
daq               0.73
ciating           0.73
Sharp             0.71
DragonMagazine    0.71
atana             0.70
lished            0.70
ilibrium          0.68
enance            0.67
Activations Density 0.329%