INDEX
Explanations
phrases related to video content that the user must watch, often signaling urgency or importance
references to videos
Negative Logits
moth          -0.70
Rutherford    -0.64
arity         -0.64
Schwarz       -0.63
stone         -0.63
wedge         -0.62
Gibbs         -0.61
mend          -0.61
Byrne         -0.61
consolation   -0.61
POSITIVE LOGITS
Thumbnails    1.05
Videos        1.03
WATCHED       0.93
Helpful       0.80
olutions      0.79
rities        0.76
illon         0.74
iframe        0.73
Video         0.73
poons         0.73
Activations Density 0.014%