Explanations
instances of the word "subscribe."
Negative Logits
obl      -0.17
uli      -0.16
uly      -0.15
ninger   -0.15
elin     -0.15
sten     -0.15
ads      -0.14
аÑĤо     -0.14
umann    -0.14
ches     -0.14
Positive Logits
.unsubscribe   0.19
allee          0.16
.fd            0.15
ivate          0.15
¢åįķ           0.14
affles         0.14
ÑĥÑģ           0.14
iT             0.14
_userdata      0.14
=sub           0.13
Activations Density: 0.008%