Explanations
references to media criticism and parody
Negative Logits
Token        Logit
ystore       -0.16
Pod          -0.14
paperback    -0.14
缣           -0.14
erli         -0.14
Pod          -0.14
Newsletter   -0.13
Enrollment   -0.13
yš           -0.13
ubar         -0.13
Positive Logits
Token        Logit
viral        0.35
meme         0.32
vir          0.29
memes        0.29
Mem          0.28
Vir          0.28
mem          0.27
mem          0.23
VIR          0.23
MEM          0.23
Activations Density: 0.060%