INDEX
Explanations
phrases indicating a significant event or decline
occurrences of the word "down."
Negative Logits
ilies     -0.80
Us        -0.77
¶ħ        -0.70
iliary    -0.70
ature     -0.67
digy      -0.64
velt      -0.64
iler      -0.63
jamin     -0.61
anooga    -0.61
Positive Logits
stairs    1.10
graded    1.04
grading   0.90
stairs    0.90
river     0.85
horm      0.84
LOAD      0.80
hill      0.73
hill      0.73
wind      0.72
Activations Density: 0.032%