    Explanations

    mentions of the word "ub" with varying activation levels

    references to "ub" as a recurrent pattern or theme

    Negative Logits
    Token         Logit
    " Lauder"     -0.78
    " ORIG"       -0.72
    " drift"      -0.67
    " Atlantic"   -0.66
    "alez"        -0.65
    " Irma"       -0.63
    "FUL"         -0.62
    " Burnett"    -0.61
    "backer"      -0.61
    "agher"       -0.61
    Positive Logits
    Token         Logit
    "lishing"     1.26
    "bing"        1.21
    "rious"       1.15
    "lisher"      1.12
    "lique"       1.11
    "bed"         1.10
    "bles"        1.10
    "bish"        1.10
    "lish"        1.05
    "ilant"       1.04
    Activation Density: 0.034%

    No Known Activations