    Explanations

    references to superhero movies and their production details

    Negative Logits
    Token         Logit
    "岸"          -0.16
    " Klopp"      -0.15
    "ski"         -0.14
    " Bun"        -0.14
    "одав"        -0.14
    "hausen"      -0.14
    "pai"         -0.14
    "шиб"         -0.14
    " microbi"    -0.13
    "ghi"         -0.13
    Positive Logits
    Token         Logit
    " Justice"    0.40
    " DC"         0.36
    "Justice"     0.34
    " Aqu"        0.32
    "DC"          0.31
    " Snyder"     0.30
    " justice"    0.30
    "justice"     0.29
    " Warner"     0.28
    "Aqu"         0.28
    Activation Density: 0.015%

    No Known Activations