INDEX
    Explanations

    mentions of a specific place name

    mentions of the name "Nam."

    Negative Logits
    " Engels"            -0.71
    " understatement"    -0.66
    " forecast"          -0.64
    " Blackwell"         -0.62
    "UID"                -0.62
    " extrap"            -0.60
    " angels"            -0.60
    " devil"             -0.59
    " inhib"             -0.58
    " subparagraph"      -0.57

    Positive Logits
    "nam"                 1.19
    "orously"             1.01
    "ned"                 0.99
    "ovember"             0.98
    "eless"               0.98
    "essage"              0.96
    "ilitary"             0.95
    "forth"               0.93
    "azing"               0.93
    "emon"                0.92

    Act Density 0.019%

    No Known Activations