INDEX
    Explanations

    references to a specific location ("Aber") within the context of a news article

    Negative Logits
    Token     Logit
    atform    -0.88
    iers      -0.81
    TPS       -0.79
    ipeg      -0.79
    ership    -0.77
    iets      -0.77
    ingham    -0.75
    ivity     -0.73
    ergic     -0.72
    iating    -0.71
    Positive Logits
    Token     Logit
    ration    1.14
    rant      1.00
    rations   0.92
    deen      0.86
    thur      0.85
    ansas     0.82
    rants     0.78
    rious     0.76
    rated     0.73
    odied     0.71
    Act Density: 0.046%

    No Known Activations