INDEX
Explanations
particular phrases to prompt the reader to continue reading
instances of the word "reading" and its variations
Negative Logits
»Ĵ          -0.80
ĪĴ          -0.79
ignt        -0.73
hap         -0.73
opard       -0.72
footed      -0.70
pora        -0.69
given       -0.68
Gleaming    -0.67
esville     -0.66
Positive Logits
...]         0.80
toggle       0.75
WATCHED      0.70
til          0.65
âĨĴ          0.64
pauses       0.63
Expand       0.62
â̦]          0.61
unsupported  0.61
presses      0.61
Activations Density: 0.011%