INDEX
Explanations
numerical quantities followed by a unit of measurement
instances of the word "runs" in various contexts
Negative Logits
Token       Logit
ether       -0.68
itive       -0.67
illus       -0.64
ase         -0.64
athering    -0.63
Crime       -0.63
atti        -0.62
iquette     -0.62
OME         -0.60
Letter      -0.59
Positive Logits
Token       Logit
runs        3.53
Runs        2.88
runs        2.53
run         2.22
ran         1.98
Run         1.80
run         1.70
Run         1.64
running     1.53
RUN         1.51
Activations Density: 0.014%