INDEX
    Explanations

    mentions of the author Christopher Hitchens

    Negative Logits
    Token            Logit
    "pires"          -0.80
    "otype"          -0.70
    "inent"          -0.70
    "agin"           -0.66
    "æ©Ł"            -0.65
    "orld"           -0.63
    "UTH"            -0.61
    " Philos"        -0.59
    "æĥ"             -0.59
    " peacefully"    -0.58
    Positive Logits
    Token            Logit
    "ched"            1.29
    "ches"            1.07
    "boxes"           1.02
    "box"             0.92
    "tle"             0.88
    "chens"           0.87
    "achi"            0.83
    "ting"            0.82
    "ted"             0.81
    "gerald"          0.79
    Activation Density: 0.641%

    No Known Activations