INDEX
    Explanations

    references to film details and critiques

    NEGATIVE LOGITS
    appers      -0.16
    ifix        -0.15
    illa        -0.15
    /tos        -0.14
    igon        -0.14
    ltk         -0.14
     cá         -0.14
    uria        -0.14
     Potter     -0.14
    alu         -0.14
    POSITIVE LOGITS
    ÏĥÏĥ        0.16
    igmoid      0.15
    imple       0.15
    istence     0.14
    ekim        0.14
    ysz         0.14
    press       0.14
    ril         0.13
    Äĥr         0.13
     Giuliani   0.13
    Activation Density: 0.670%

    No Known Activations