INDEX
Explanations
paragraphs ending with a question or discussion prompt
sections of text that indicate a call to action or additional reading related to various topics
Negative Logits
oun          -0.66
mbuds        -0.64
urch         -0.64
Morty        -0.63
inspecting   -0.62
ioch         -0.61
supervised   -0.60
contrace     -0.60
lifes        -0.59
shove        -0.59
Positive Logits
Tags         0.83
Woman        0.75
Comments     0.74
Comment      0.73
Original     0.73
Recent       0.73
Watch        0.72
BOOK         0.69
âϦ           0.68
*            0.68
Activations Density: 0.075%