INDEX
    Explanations

    phrases related to events happening at a later time

    phrases indicating later revelations or statements in the text

    Negative Logits
    Token              Logit
    " Container"       -0.70
    "Plot"             -0.69
    "minecraft"        -0.68
    "afety"            -0.67
    "yet"              -0.67
    "cius"             -0.66
    " encyclopedia"    -0.62
    " Mahjong"         -0.61
    "anium"            -0.61
    " Simpl"           -0.60
    Positive Logits
    Token              Logit
    " recons"           0.87
    " forg"             0.80
    " wiser"            0.79
    "livion"            0.72
    " regretted"        0.72
    " relent"           0.71
    "pez"               0.71
    " realise"          0.67
    " sidx"             0.64
    "}\"                0.63
    Activation Density: 0.221%

    No Known Activations