INDEX
    Explanations

    math problems

    Negative Logits
     fr           -0.09
     pist         -0.08
    .fr           -0.07
     frapp        -0.07
    /fr           -0.07
     flot         -0.07
     kriz         -0.07
    _fr           -0.07
    (fr           -0.07
    abi           -0.07
    Positive Logits
     stipulated   0.10
     ours         0.09
     stated       0.09
     kosa         0.08
    今回は        0.08
    servez        0.08
     그렇         0.08
     우리는       0.08
    úblic         0.08
    specified     0.08
    Act Density 0.061%

    No Known Activations