INDEX
    Explanations

    mentions of specific actors and the "Pirates of the Caribbean" franchise

    Negative Logits
    Token       Logit
    "ximo"      -0.15
    "oba"       -0.15
    "witter"    -0.14
    " Oval"     -0.14
    "ayment"    -0.14
    "asan"      -0.13
    "ues"       -0.13
    " Mixer"    -0.13
    "emics"     -0.13
    "ario"      -0.13
    Positive Logits
    Token       Logit
    "loat"      0.16
    "VML"       0.15
    " 自动生成"  0.15
    "ーリ"       0.14
    "oulos"     0.14
    "rame"      0.14
    "ATAR"      0.14
    "til"       0.14
    "iset"      0.14
    "istar"     0.14
    Activation Density: 0.005%

    No Known Activations