INDEX
    Explanations

    the start of a new section or paragraph in the document

    Negative Logits
    Token            Logit
    "</table>"       -0.71
    "]-->"           -0.67
    ")))\n"          -0.60
    "}}}"            -0.58
    "<bos>"          -0.57
    "endforeach"     -0.57
    " للاسماء"       -0.57
    ")=>{\n"         -0.57
    "}}}}"           -0.56
    "}}}$"           -0.56
    Positive Logits
    Token              Logit
    " myſelf"          0.81
    " reaſon"          0.80
    " Theſe"           0.79
    " Majefty"         0.79
    " Anſ"             0.77
    " purpoſe"         0.77
    " itſelf"          0.76
    " Dicapai"         0.73
    "enumi"            0.73
    "unknownFields"    0.72
    Activation Density: 0.030%

    No Known Activations