INDEX
Explanations
instances where the word "Start" is emphasized or repeated
instances of the word "Start."
Negative Logits
conveyed     -0.64
aven         -0.64
uro          -0.63
graded       -0.62
concealed    -0.62
clad         -0.61
bos          -0.60
corrobor     -0.60
sustained    -0.59
unseen       -0.59
Positive Logits
Start        3.65
Start        2.40
start        2.10
START        2.04
Starts       1.94
start        1.90
Begin        1.61
Startup      1.49
Starting     1.44
starting     1.38
Activations Density 0.012%