INDEX
    Explanations

    words related to past events or situations

    references to specific groups of people or entities

    Negative Logits
    ........                          -0.74
    ................                  -0.74
    ........................          -0.64
    .............                     -0.64
    ................................  -0.63
    .........                         -0.63
     Apart                            -0.63
     Volcano                          -0.60
     whichever                        -0.59
     435                              -0.58
    Positive Logits
     survived                         0.94
    ppers                             0.91
     participated                     0.90
    ever                              0.86
     interacted                       0.84
    oped                              0.84
     frequ                            0.82
     disliked                         0.81
     cared                            0.80
     ventured                         0.80
    Act Density 0.162%

    No Known Activations