INDEX
    Explanations

    Math problem solving

    Negative Logits
    Token           Logit
    " Alfa"         -0.08
    "isex"          -0.08
    " Cougar"       -0.08
    " എന്ത"         -0.08
    (not shown)     -0.08
    " sober"        -0.07
    (not shown)     -0.07
    " ની"           -0.07
    " 为什么"        -0.07
    " fierc"        -0.07
    Positive Logits
    Token           Logit
    " revolve"      0.09
    " involving"    0.08
    " ಸರ್ಕ"         0.08
    "ham"           0.08
    "ploy"          0.08
    " Gug"          0.07
    "Potential"     0.07
    " applied"      0.07
    " pertains"     0.07
    "Tickets"       0.07
    Act Density 0.044%

    No Known Activations