INDEX
    Explanations

    punctuation marks or numeric symbols

    NEGATIVE LOGITS
    LLocation    -1.02
    ContentAsync    -0.92
     utafitiHapana    -0.92
    HostException    -0.89
     itſelf    -0.88
     الرياضيه    -0.86
    felves    -0.86
     ―――――    -0.85
    存于互联网档案馆    -0.85
    ſelves    -0.84
    POSITIVE LOGITS
    ,    0.76
    "    0.69
    \    0.65
    (    0.61
    .    0.60
     (    0.59
    ;    0.57
     ,    0.56
     "    0.55
    <bos>    0.55
    Act Density 0.192%

    No Known Activations