INDEX
    Explanations

    proper nouns ending with "bert."

    Negative Logits
     lock       -0.70
     swap       -0.68
     swapped    -0.65
     hereafter  -0.65
     end        -0.63
     Chain      -0.63
     yours      -0.62
     HI         -0.62
     haven      -0.62
     marathon   -0.62
    Positive Logits
    bert        4.63
    berto       1.71
    bern        1.62
    bart        1.46
    berman      1.43
    ber         1.41
    enegger     1.24
    BER         1.23
    berger      1.21
    bard        1.19
    Act Density 0.015%

    No Known Activations