INDEX
    Explanations

    the letter 'a' in various contexts

    NEGATIVE LOGITS
    ager      -0.15
    zia       -0.14
    elle      -0.14
    ling      -0.14
    b         -0.14
    ced       -0.14
    ò         -0.14
     fmt      -0.13
    den       -0.13
    oen       -0.13

    POSITIVE LOGITS
    alley      0.20
    à¹Ģม       0.15
    lse        0.15
    éru        0.14
    istol      0.14
    acific     0.14
    ÏĦÏīν      0.14
    ä¸Ķ        0.13
    ulty       0.13
    sand       0.13
    Act Density 0.007%

    No Known Activations