INDEX
    Explanations

    special characters and formatting indicators

    Negative Logits
    `;             -0.56
    وأضاف          -0.55
    География      -0.54
    ]]             -0.53
    abestanden     -0.53
    (blank)        -0.52
     comuniques    -0.52
     ใหม่           -0.51
    Bronnen        -0.50
     tork          -0.50
    Positive Logits
    <bos>              1.10
    DockStyle          0.65
    hoeddwyd           0.65
    الإنجليزية         0.64
    zegovina           0.57
    fillType           0.57
     amitié            0.55
    ConstraintMaker    0.55
    PYX                0.55
    下载附件            0.54
    Activation Density: 0.255%

    No Known Activations