    Explanations

    references to zigzag patterns

    Negative Logits
     Ara       -0.17
    åķ        -0.16
    airs      -0.15
    ough      -0.15
    lw        -0.15
    rade      -0.15
     fak      -0.14
     Arbor    -0.14
    ceptor    -0.14
    ière      -0.14
    Positive Logits
    ç¯ī       0.15
    /access   0.14
     headline 0.14
    uate      0.14
    ey        0.14
    çİĩ       0.14
    ekyll     0.14
    arr       0.13
    479       0.13
    619       0.13
    Act Density 0.008%

    No Known Activations