    Explanations

    repeated characters or patterns in text

    Negative Logits
    ete      -0.16
    avy      -0.16
    owa      -0.15
    enko     -0.15
    opa      -0.15
    ering    -0.15
    aging    -0.15
    oci      -0.15
    awa      -0.15
    ings     -0.14
    Positive Logits
    уди      0.25
    рг       0.20
    на       0.19
    льт      0.17
    ук       0.17
    trak     0.17
    рист     0.17
    мор      0.16
    ним      0.16
    раб      0.16
    Act Density 0.009%

    No Known Activations