    Explanations

    math problems

    Negative Logits
    Token             Logit
    " instruction"    -0.08
    " Presenter"      -0.08
    "ربة"             -0.08
    (blank)           -0.07
    " presenter"      -0.07
    " handler"        -0.07
    " Trees"          -0.07
    (blank)           -0.07
    "ixir"            -0.07
    " दाख"            -0.07
    Positive Logits
    Token             Logit
    " fark"           0.09
    "nger"            0.08
    " млад"           0.08
    " acuer"          0.08
    " perempuan"      0.08
    " aged"           0.08
    " widers"         0.08
    " шығарм"         0.08
    "akoa"            0.08
    " девуш"          0.08
    Activation Density: 0.115%

    No Known Activations