INDEX
    Explanations

    references to original works or concepts

    Negative Logits
    Token         Logit
    " zast"       -0.14
    "izard"       -0.14
    " ic"         -0.14
    "554"         -0.14
    "lish"        -0.14
    "akh"         -0.14
    "vě"          -0.14
    "ands"        -0.14
    "allo"        -0.13
    " toi"        -0.13
    Positive Logits
    Token         Logit
    "/original"   0.25
    " original"   0.19
    " Original"   0.18
    " ORIGINAL"   0.17
    "-original"   0.16
    "ajo"         0.16
    "etty"        0.16
    "original"    0.15
    " erotico"    0.15
    "(original"   0.15
    Activation Density: 0.141%

    No Known Activations