Explanations
phrases related to a decrease or decline in something
occurrences of the word "down."
Negative Logits
ature      -0.80
Us         -0.76
ilies      -0.75
velt       -0.72
¶ħ         -0.71
Loading    -0.68
rehens     -0.68
xtap       -0.68
atures     -0.67
olkien     -0.67
Positive Logits
stairs     1.02
graded     0.98
stairs     0.86
grading    0.82
river      0.81
hill       0.80
played     0.79
horm       0.78
cast       0.75
linked     0.74
Activations Density: 0.022%
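
For reference, an activation density figure like this one is usually read as the share of sampled tokens on which the feature fires at all. The sketch below is a minimal illustration under that assumption. The function name `activation_density`, the "strictly greater than zero" threshold, and the sample data are all hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations):
    """Return the percentage of tokens with a strictly positive activation.

    Assumption: a token counts as "active" when its activation is > 0.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Hypothetical sample: 22 active tokens out of 100,000 sampled tokens.
sample = [1.5] * 22 + [0.0] * 99_978
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.022%
```

Under this reading, a density of 0.022% corresponds to roughly 22 active tokens per 100,000 sampled, which would make this a sparse feature.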