INDEX
    Explanations

    multiple-choice math questions

    Negative Logits
     DEA          -0.08
     Daher        -0.07
     regulator    -0.07
     Already      -0.07
     washed       -0.07
    lana          -0.07
     lavado       -0.07
     esas         -0.07
     محدود        -0.07
     alive        -0.07
    Positive Logits
     concerne     0.08
    Projectile    0.08
    (blank)       0.08
    тик           0.07
    Text          0.07
    कि            0.07
     classic      0.07
    Leaderboard   0.07
     delectable   0.07
    (blank)       0.07
    Act Density 0.052%

    No Known Activations