Explanations
phrases related to scrolling down for video content
indicators of media and content consumption
Negative Logits
imaru          -0.67
beit           -0.66
territ         -0.66
endale         -0.66
ciating        -0.65
olor           -0.65
includ         -0.64
accordingly    -0.64
gnu            -0.64
sacrific       -0.63
Positive Logits
Trend          0.80
Same           0.79
Success        0.70
Prin           0.69
':             0.66
Kiw            0.66
Trans          0.66
Failure        0.65
'[             0.65
Safe           0.65
Activations Density 0.060%