INDEX
    Explanations

    mentions of a specific location or establishment, particularly associated with food or events

    Negative Logits
    "erate"        -0.18
    "erdem"        -0.17
    "hood"         -0.17
    "jay"          -0.15
    " principle"   -0.15
    "umes"         -0.14
    "heiro"        -0.14
    "erca"         -0.14
    "eru"          -0.14
    "rophe"        -0.13
    Positive Logits
    "oz"           0.17
    "ucky"         0.16
    "allet"        0.15
    "_deinit"      0.15
    "stead"        0.15
    "endTime"      0.15
    "oen"          0.15
    "verbatim"     0.14
    "avo"          0.14
    "reeNode"      0.14
    Act Density 0.016%

    No Known Activations