INDEX
Explanations
mentions of the word "post" throughout the text
Negative Logits
addin     -0.16
367       -0.14
vey       -0.14
_cond     -0.13
egg       -0.13
élé       -0.13
ニック    -0.13
profit    -0.13
xc        -0.13
zt        -0.13
Positive Logits
šek              0.18
emark            0.15
.mousePosition   0.15
serter           0.15
ček              0.14
assi             0.14
coe              0.14
氏               0.14
apon             0.14
å¥               0.14
Activations Density 0.015%