Explanations
numbers that indicate a reduction or decrease
phrases indicating a reduction or simplification
Negative Logits
oris      -0.76
awn       -0.73
went      -0.73
mir       -0.71
ruly      -0.69
ear       -0.67
ris       -0.67
echoed    -0.66
rises     -0.65
hur       -0.64
Positive Logits
ashes      0.98
scraps     0.94
rubble     0.93
zero       0.90
fractions  0.88
basics     0.88
earth      0.87
obscurity  0.87
shorts     0.87
mere       0.86
Activations Density 0.162%