INDEX
    Explanations

    mentions of people's names, specifically variations of the name "Ben"

    repeated mentions of the name "Ben."

    Negative Logits
    Token             Logit
    inarily           -0.75
    REDACTED          -0.74
    æĸ¹               -0.74
    DragonMagazine    -0.72
    mson              -0.69
    perse             -0.68
    åŃ                -0.67
    ngth              -0.67
    hower             -0.65
    âĶģ               -0.65
    Positive Logits
    Token             Logit
    jamin             1.50
    oit               1.02
    nington           1.02
    chers             0.99
    cher              0.96
    ches              0.86
    utz               0.84
    imaru             0.83
    ghazi             0.83
    isher             0.82
    Activation Density: 0.015%

    No Known Activations