    Explanations

    math problems

    Negative Logits
     entdeckt       -0.09
     ontdekt        -0.09
     discoveries    -0.09
                    -0.08
    GAIN            -0.08
     gone           -0.07
     Govern         -0.07
    discover        -0.07
    _checked        -0.07
    дағы            -0.07
    Positive Logits
    afi             0.08
     parcel         0.08
     mahal          0.08
    leme            0.07
                    0.07
     تجاه           0.07
    )!=             0.07
     fie            0.07
     china          0.07
    kile            0.07
    Act Density 0.018%

    No Known Activations