INDEX
    Explanations

    occurrences of specific characters or symbols

    Negative Logits
    ÂĿ
    -0.17
     ..."↵
    -0.14
    PIX
    -0.14
    `↵↵
    -0.14
    LOUR
    -0.13
     kind
    -0.13
    raç
    -0.13
    "`↵
    -0.13
     sort
    -0.13
    ;"↵
    -0.13
    Positive Logits
     <
    0.48
    ,<
    0.39
    <
    0.38
    <br
    0.35
     </
    0.35
    .<
    0.35
    <i
    0.35
    <strong
    0.34
    <span
    0.34
     (<
    0.34
    Activation Density: 0.002%

    No Known Activations