INDEX
    Explanations

    punctuation marks, particularly semicolons and periods

    Negative Logits
     fran         -0.76
    widetilde     -0.72
    alan          -0.67
    ondy          -0.66
    ers           -0.64
     of           -0.64
    nungs         -0.63
    P             -0.62
    tps           -0.62
     جه           -0.62
    Positive Logits
    $;            1.44
    ;;;           1.31
    ;;;;          1.27
    AndEndTag     1.22
    _;            1.22
    icolon        1.22
    +;            1.19
    }$;           1.18
    __;           1.15
    ,:);          1.14
    Act Density 0.213%

    No Known Activations