INDEX
    Explanations

    themes of power dynamics and choices in narrative contexts

    Negative Logits
    zk
    -0.16
     اÙĦتÙĨ
    -0.16
     cev
    -0.15
    /REC
    -0.15
    nech
    -0.15
    otify
    -0.14
    ायल
    -0.14
    ÙĬÙĪ
    -0.14
    ulumi
    -0.13
    Ļ
    -0.13
    Positive Logits
     loose
    0.15
     Farr
    0.15
     ideal
    0.15
     Penny
    0.15
     Loose
    0.15
    ĴĪ
    0.14
     indirect
    0.14
    anian
    0.14
     all
    0.14
     Gret
    0.14
    Activation Density: 0.036%

    No Known Activations