INDEX
    Explanations

    mentions of the word "Rozz" with an emphasis on activations of 9 or 10

    instances of the name "Rozz"

    Negative Logits
    Token               Logit
    " Conrad"           -0.71
    "¥µ"                -0.69
    " Polar"            -0.69
    " lapse"            -0.69
    " fitness"          -0.67
    " Buffett"          -0.65
    " conditioning"     -0.64
    "croft"             -0.63
    " appropriation"    -0.63
    " foremost"         -0.62
    Positive Logits
    Token               Logit
    "zz"                 1.16
    "arella"             1.13
    "ucc"                0.97
    "azz"                0.92
    "ZZ"                 0.91
    "etta"               0.91
    "ella"               0.90
    "abba"               0.90
    "hou"                0.88
    "ebra"               0.88
    Activation Density: 0.018%

    No Known Activations