INDEX
    Explanations

    mentions of the name "Todd" or similar words related to it

    Negative Logits
    Token         Logit
    "<eos>"       -0.50
    " Hiller"     -0.42
    " Schar"      -0.41
    "↵↵"          -0.40
    "ar"          -0.39
    " Blume"      -0.38
    "er"          -0.38
    " Backman"    -0.37
    "ilber"       -0.37
    " Korn"       -0.37
    Positive Logits
    Token         Logit
    "Todd"        2.19
    " Todd"       2.09
    "todd"        1.66
    " todd"       1.53
    " TOD"        1.29
    " Toddler"    1.02
    "Tod"         1.02
    " myſelf"     0.99
    " Tod"        0.98
    " toddlers"   0.97
    Activation Density: 0.004%

    No Known Activations