    Explanations

    indicators of code structure or formatting

    Negative Logits
    تقاوى
    -0.76
    iſchen
    -0.71
    bibinfo
    -0.71
     queſta
    -0.71
    ロウィン
    -0.70
    <unused14>
    -0.69
    <unused16>
    -0.68
    <unused52>
    -0.68
    <unused8>
    -0.68
    [@BOS@]
    -0.68
    Positive Logits
    0.38
     Himmel
    0.34
    ‌شده
    0.34
     same
    0.34
     Mosley
    0.33
    IntoConstraints
    0.32
    D
    0.32
                  
    0.32
     desaparecido
    0.31
     my
    0.31
    Activation Density: 0.014%

    No Known Activations