INDEX
    Explanations

    tokens that indicate structured formatting or elements in programming contexts

    Junk characters and non-English text

    specific numbers and punctuation

    Negative Logits
    transQ
    -0.82
    Hentet
    -0.80
     صوتيه
    -0.79
    󠁢
    -0.74
     "}";
    -0.67
     "]";
    -0.66
    AddTagHelper
    -0.62
    }),
    
    -0.61
    Legături
    -0.61
    >
    
    
    -0.59
    Positive Logits
    OOTDTY
    0.61
    gebob
    0.60
    tvguidetime
    0.59
    complexContent
    0.59
    Földrajzportál
    0.57
    0.56
    mobileqq
    0.55
    AsStream
    0.55
     librement
    0.54
    qrstuvwxyz
    0.54
    Act Density 0.112%

    No Known Activations