INDEX
    Explanations

    references to movies or actors

    instances of the word "starring" in relation to movies and performances

    Negative Logits
    Token        Logit
    "utral"      -0.85
    "abus"       -0.77
    "regulated"  -0.76
    "Ĥİ"         -0.74
    "apa"        -0.73
    "veyard"     -0.72
    "nea"        -0.72
    "oard"       -0.71
    "nsic"       -0.71
    "adem"       -0.71
    Positive Logits
    Token        Logit
    " starring"  1.19
    "Dust"       0.78
    " stars"     0.77
    "ãĤ¤ãĥĪ"     0.76
    "Credits"    0.76
    " Actress"   0.75
    " Pengu"     0.75
    " starred"   0.75
    "stars"      0.71
    " Features"  0.71
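    Tokens such as "ãĤ¤ãĥĪ" and "Ĥİ" in the lists above look like raw vocabulary strings from a byte-level BPE tokenizer rather than readable text. The sketch below assumes the vocabulary uses the GPT-2 byte-to-character mapping; the page itself does not name the tokenizer, so that is an assumption. The helper names token_bytes and decode_token are illustrative, not part of any library.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2 style byte-level BPE (assumed)."""
    # Bytes that are already printable map to themselves.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token: str) -> bytes:
    """Recover the raw bytes behind a vocabulary string (hypothetical helper)."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a vocabulary string to text; invalid UTF-8 becomes U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¤ãĥĪ", "Ĥİ", "Ġstarring"]:
        print(repr(tok), token_bytes(tok), repr(decode_token(tok)))
```

    Under that assumption, "ãĤ¤ãĥĪ" decodes to the katakana pair "イト", while "Ĥİ" maps to the bytes 0x82 0x8E, which are UTF-8 continuation bytes and do not form a complete character on their own.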
    Activation Density: 0.008%

    No Known Activations