INDEX
    Explanations

    references to narratives and storytelling elements

    Negative Logits
    ities     -0.15
    /bus      -0.15
    estar     -0.15
    itr       -0.15
    als       -0.15
    zelf      -0.15
    sei       -0.14
    rupt      -0.14
    ï¸ı       -0.14  (likely the byte-level form of U+FE0F, variation selector-16)
    ustos     -0.14
    Positive Logits
     told     0.33
    book      0.24
    books     0.21
    tell      0.20
    lines     0.20
    boards    0.20
    boarding  0.19
    -t        0.19
    elling    0.18
    åijĬè¯ī    0.18  (likely the byte-level form of 告诉, "tell")
    Act Density 0.051%

    No Known Activations
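
    Two tokens in the logit lists (ï¸ı and åijĬè¯ī) look like raw byte-level
    BPE strings rather than readable text. The sketch below shows one way to
    decode them, assuming the tokenizer uses the GPT-2-style byte-to-character
    table; the page does not name the tokenizer, so that is an assumption. The
    helper names are hypothetical. The sample strings are written as escapes
    because the "ij" shown above may be a flattened U+0133 character.

```python
def byte_to_char_table():
    """Build the GPT-2-style mapping from byte values to printable characters.

    Assumption: the feature's tokenizer follows this convention.
    """
    # Bytes that are already printable keep their own code point.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    table = {b: chr(b) for b in keep}
    # Every remaining byte is shifted to a code point starting at 256.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_to_char_table().items()}


def decode_token(token):
    """Turn a byte-level token string back into ordinary text."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # Tokens can end mid-character, so undecodable bytes are replaced.
    return raw.decode("utf-8", errors="replace")


# Escaped forms of the two tokens as they would appear in a vocabulary file.
TELL_TOKEN = "\u00e5\u0133\u012c\u00e8\u00af\u012b"
SELECTOR_TOKEN = "\u00ef\u00b8\u0131"

if __name__ == "__main__":
    print(decode_token(TELL_TOKEN))
    print(repr(decode_token(SELECTOR_TOKEN)))
```

    Under that assumption the first token decodes to 告诉 ("tell"), which fits
    the storytelling explanation, and the second to the variation selector
    U+FE0F. In the same table a leading space is shown as Ġ (U+0120).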