INDEX
    Explanations

    references to storytelling and narrative structures

    Negative Logits
    /ros        -0.16
    inspace     -0.15
    iek         -0.15
    ister       -0.15
    æīķ         -0.15
    amax        -0.14
    ppard       -0.14
    isz         -0.14
    iors        -0.14
    rollo       -0.14

    Positive Logits
    oog          0.16
    vens         0.14
    257          0.14
    wang         0.14
    umat         0.14
     Her         0.14
    unga         0.13
    assic        0.13
    urally       0.13
    up           0.13

    Act Density 0.006%

    No Known Activations