INDEX
    Explanations

    instances of the word "directed" indicating film direction

    Negative Logits
    Token       Logit
    "iliar"     -0.17
    "rious"     -0.17
    "mina"      -0.17
    "mil"       -0.15
    "Ìĥ"        -0.15
    "ascar"     -0.15
    "δεÏĤ"      -0.14
    "lamaz"     -0.14
    "ILLA"      -0.14
    " mili"     -0.14
    Positive Logits
    Token         Logit
    "اÛĮÙĩ"       0.14
    "achsen"      0.14
    "alth"        0.14
    " loyalty"    0.14
    "á»ı"         0.14
    " sandwich"   0.13
    "HER"         0.13
    "imento"      0.13
    "Une"         0.13
    " Sandwich"   0.13
    Activation Density: 0.005%

    No Known Activations