INDEX
Explanations
prepositional phrases describing downward movement
occurrences of the word "the"
Negative Logits
replace      -0.72
thood        -0.70
qt           -0.70
craft        -0.67
anan         -0.66
mate         -0.64
namely       -0.63
SPONSORED    -0.62
cé           -0.60
taker        -0.60
Positive Logits
entire       1.32
remainder    1.18
proverbial   1.13
whole        1.11
entirety     1.08
same         1.07
fastest      1.02
hardest      1.02
slightest    0.98
smallest     0.97
Activations Density: 0.416%