INDEX
    Explanations

    references to plot twists and shocking elements in narratives

    Negative Logits
    ho              -0.71
    ighth           -0.70
    ittee           -0.69
    elson           -0.69
     largeDownload  -0.68
    alty            -0.68
    fort            -0.67
    idated          -0.67
    pora            -0.66
     easing         -0.66
    Positive Logits
     happened       0.96
     happens        0.94
     Happ           0.91
    ILE             0.89
    ?!              0.84
     ensued         0.81
    TF              0.80
    !?              0.77
     happ           0.76
    TY              0.74
    Act Density 0.004%

    No Known Activations