    Explanations

    repeated special characters or icons in the text

    Negative Logits
    "aliz"            -0.15
    "afil"            -0.14
    "olie"            -0.14
    " experience"     -0.14
    "|h"              -0.14
    "alian"           -0.14
    " Shuffle"        -0.13
    "ết"              -0.13
    "izza"            -0.13
    "toy"             -0.13
    Positive Logits
    "оген"            0.19
    " San"            0.15
    " Heb"            0.15
    " Texas"          0.15
    ".Constraint"     0.15
    " Hel"            0.14
    " PE"             0.14
    " TEX"            0.14
    "ogen"            0.14
    " Hero"           0.14
    Activation Density: 0.010%

    No Known Activations