INDEX
    Explanations

    proper nouns related to a specific entity or individual

    mentions of a specific person's name

    Negative Logits
     dec            -0.69
     clean          -0.68
     conditioning   -0.65
     England        -0.65
     pipes          -0.64
     Ok             -0.63
     pipe           -0.63
     smoke          -0.63
     cigarettes     -0.62
     ESP            -0.62
    Positive Logits
    alan            4.81
    abal            1.20
    alos            1.18
    atan            1.18
    ala             1.11
    assian          1.10
    aris            1.09
    alin            1.08
    asma            1.07
    anan            1.07
    Act Density 0.017%

    No Known Activations