INDEX
    Explanations

    concepts related to foundational ideas and the education of undergraduate students

    Negative Logits
    izr         -0.19
    antino      -0.15
    mino        -0.15
    ier         -0.15
    zed         -0.15
    ouch        -0.14
     Rouge      -0.14
    shops       -0.14
    acula       -0.14
    tec         -0.14
    Positive Logits
    neath       0.23
    /down       0.20
    lings       0.17
    uates       0.17
    whelming    0.17
    574         0.16
    warf        0.16
    hill        0.16
    IMS         0.15
    graduate    0.15
    Act Density 0.029%

    No Known Activations