INDEX
    Explanations

    references to the word "who" in various contexts, indicating a focus on questions about identity or roles within a narrative

    Negative Logits
    раль
    -0.16
    woo
    -0.16
    ault
    -0.15
    pute
    -0.15
    spi
    -0.15
    rana
    -0.15
    rogram
    -0.14
    181
    -0.13
    erais
    -0.13
    atik
    -0.13
    Positive Logits
     else
    0.35
    _else
    0.24
     ELSE
    0.21
    soever
    0.21
    /how
    0.20
     exactly
    0.20
     Else
    0.19
    else
    0.18
    	else
    0.18
    osh
    0.18
    Act Density 0.029%

    No Known Activations