Explanations
instances of repetition and related terms in the text
Negative Logits
erea     -0.15
akin     -0.15
ought    -0.14
yz       -0.14
Ùħز      -0.14
ye       -0.14
pop      -0.14
yard     -0.14
okol     -0.13
link     -0.13
Positive Logits
ency       0.16
able       0.15
ìľ¨        0.14
ucci       0.14
ively      0.14
çı¾        0.14
ative      0.14
/single    0.14
ATTERN     0.14
μÏĢ        0.14
Activations Density: 0.032%
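
To put the density figure in context, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions where the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of 0.032%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 4 of 12,500 token positions.
sample = [0.0] * 12_496 + [0.7, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.032% corresponds to the feature firing on roughly 3 of every 10,000 tokens.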