INDEX
    Explanations

    words related to legal and criminal activities and proceedings

    nouns that relate to people, properties, and groups

    NEGATIVE LOGITS
     experien   -0.67
    rior        -0.66
    lier        -0.64
    Phys        -0.64
    rolog       -0.63
    graph       -0.62
    à¨          -0.61
    OUS         -0.60
    à¤          -0.59
    âĸ¬         -0.58
    POSITIVE LOGITS
    hip         1.14
    cape        1.10
    heet        1.02
    etter       0.99
    mith        0.97
    ilver       0.93
    etting      0.92
    poons       0.91
    ettings     0.90
    hips        0.89
    Act Density 0.761%

    No Known Activations