INDEX
    Explanations

    occurrences of the letter 'U' in various contexts

    Negative Logits
    Token      Logit
    "b"        -0.16
    "t"        -0.16
    "artz"     -0.15
    "anced"    -0.15
    "nul"      -0.15
    "ico"      -0.14
    "orry"     -0.14
    " cons"    -0.14
    "bish"     -0.14
    "ses"      -0.14
    Positive Logits
    Token      Logit
    "trecht"   0.20
    "igure"    0.19
    "luÄŁ"     0.17
    "rum"      0.16
    "lan"      0.16
    "zb"       0.16
    "åŃ"       0.16
    "ivar"     0.15
    " Nolan"   0.15
    "gro"      0.15
    Activation Density: 0.019%

    No Known Activations