INDEX
    Explanations

    patterns of repetition and frequency in various contexts

    Negative Logits
     First
    -0.24
    First
    -0.19
    ricks
    -0.17
     nearest
    -0.17
     首
    -0.15
    oda
    -0.15
     closest
    -0.15
    lyn
    -0.15
     FIRST
    -0.14
    рик
    -0.14
    Positive Logits
     third
    0.60
     fourth
    0.59
     fifth
    0.56
     sixth
    0.54
    third
    0.52
     second
    0.49
     THIRD
    0.48
     seventh
    0.48
    第三
    0.46
     eighth
    0.44
    Act Density 0.115%

    No Known Activations