INDEX
    Explanations

    specific numeric or formatting patterns indicative of technical specifications or code structure

    NEGATIVE LOGITS
    :].             -0.97
    )"),            -0.91
    )");\n          -0.89
    :]:             -0.89
     referenties    -0.87
    ."],            -0.86
     باخ            -0.86
    :],             -0.85
     himo           -0.82
    ']):            -0.82
    POSITIVE LOGITS
    0               1.58
     zero           0.94
    zero            0.90
     Zero           0.88
     Zer            0.85
     صفر            0.84
    Zero            0.84
     zéro           0.82
     ZERO           0.79
     cero           0.77
    Activation Density 0.921%

    No Known Activations