INDEX
    Explanations

    proper nouns related to people or places

    references to the term "den" with varying capitalization

    Negative Logits
     feats      -0.62
     Typh       -0.62
    EED         -0.59
    olicy       -0.59
    ties        -0.58
    mph         -0.57
    OTH         -0.57
    heet        -0.56
    ONS         -0.56
    ments       -0.56
    Positive Logits
    izens       1.21
    omination   1.16
    izen        1.14
    unciation   1.12
    omin        1.09
    arius       1.03
    unci        1.03
    zel         1.03
    holm        0.99
    ormal       0.92
    Activation Density: 0.035%

    No Known Activations