INDEX
    Explanations

    words related to physical locations or structures

    references to hospital wards

    Negative Logits
    Token          Logit
    "ãĥ¤"          -0.77
    "ãĥĪ"          -0.75
    "istine"       -0.75
    " Hav"         -0.70
    "igslist"      -0.69
    "pheus"        -0.69
    "ctory"        -0.68
    "Bon"          -0.67
    "DonaldTrump"  -0.66
    "issance"      -0.66

    Positive Logits
    Token          Logit
    "robe"         1.04
    " ward"        1.03
    "room"         1.02
    " wards"       0.86
    "rooms"        0.85
    "masters"      0.84
    "nton"         0.83
    "ring"         0.83
    "stones"       0.80
    "lings"        0.79

    Activation Density: 0.006%

    No Known Activations