INDEX
    Explanations

    concepts related to unexpected plot developments or surprises in narratives

    Negative Logits
    ipple
    -0.16
    mon
    -0.15
    ground
    -0.14
     liệu
    -0.14
    HY
    -0.14
    ween
    -0.14
     Callable
    -0.14
    lite
    -0.14
    luet
    -0.14
    DonaldTrump
    -0.13
    Positive Logits
    thal
    0.20
     twist
    0.19
    ÑĢабаÑĤ
    0.15
    ero
    0.15
    ī
    0.14
    ych
    0.14
    aģı
    0.14
    arily
    0.14
    ixin
    0.14
     Zucker
    0.14
    Act Density 0.047%

    No Known Activations