INDEX
    Explanations

    mentions of people's names

    proper names and entities, particularly related to people

    Negative Logits
    'stood'            -0.74
    'ãĢIJ'              -0.70
    'abiding'          -0.70
    'ãĢİ'              -0.67
    '$.'               -0.66
    ' commandments'    -0.63
    ' suits'           -0.60
    ' ITS'             -0.59
    'represented'      -0.58
    'profits'          -0.57
    Positive Logits
    ')'                0.85
    ' .)'              0.78
    ')|'               0.78
    ' *)'              0.77
    ')/'               0.76
    ' Photography'     0.74
    'â̦)'              0.72
    ')."'              0.72
    ' IMAGES'          0.72
    ',)'               0.72
    Activation Density: 0.269%

    No Known Activations