INDEX
Explanations
YouTube video links
occurrences of the word "watch" and related terms
Negative Logits
capital         -0.66
Ling            -0.63
ax              -0.63
ground          -0.62
recomb          -0.61
sour            -0.60
Ram             -0.60
depreciation    -0.59
latent          -0.59
clust           -0.59
POSITIVE LOGITS
watch       4.60
Watch       2.09
watching    2.06
Watch       2.05
watch       1.99
WATCH       1.55
WATCH       1.53
watches     1.46
wat         1.30
Watching    1.13
Activations Density 0.006%