    Explanations

    the name "Bab" or variations of it specified in the activations

    occurrences of the name "Bab" and its variations

    Negative Logits
    Token        Logit
    "backer"     -0.79
    "PORT"       -0.73
    "ICES"       -0.73
    "IELD"       -0.69
    "VICE"       -0.69
    "dfx"        -0.69
    "OPLE"       -0.68
    "ICE"        -0.68
    "MENT"       -0.68
    " Lauder"    -0.67
    Positive Logits
    Token        Logit
    "alon"       1.02
    "oru"        0.92
    "cock"       0.91
    "ulin"       0.90
    "raham"      0.89
    " Bab"       0.87
    "oard"       0.87
    "ule"        0.86
    "aret"       0.86
    "ulet"       0.84
    Activation Density: 0.018%

    No Known Activations