INDEX
    Explanations

    articles preceding nouns

    Negative Logits
    vn         -0.18
    ero        -0.17
    ose        -0.16
    st         -0.15
    lf         -0.14
    isms       -0.14
    zing       -0.14
    atics      -0.13
    ismo       -0.13
    ism        -0.13

    Positive Logits
    lein        0.19
     portion    0.16
     portions   0.16
     existed    0.14
     further    0.14
    ided        0.14
    909         0.14
     exists     0.14
    upo         0.14
    edis        0.14

    Activation Density: 0.100%

    No Known Activations