INDEX
    Explanations

    the presence of specific characters or sequences within words

    Negative Logits
    ager      -0.17
    ce        -0.16
    lane      -0.16
    lag       -0.16
    ne        -0.16
    anna      -0.16
    loff      -0.16
    itage     -0.16
    soever    -0.15
    enced     -0.15
    Positive Logits
    upal      0.17
    letic     0.17
    erif      0.16
    ãi        0.15
     través   0.14
    alet      0.14
    oog       0.14
    resi      0.14
     partir   0.14
    éro       0.14
    Act Density 0.022%

    No Known Activations