INDEX
    Explanations

    movie-related terms

    references to movies or the word "movie."

    Negative Logits
    Token             Logit
    "ilities"         -0.79
    "raham"           -0.71
    "sembly"          -0.69
    " Haitian"        -0.68
    "urst"            -0.68
    "withstanding"    -0.68
    "otic"            -0.68
    "cffff"           -0.68
    "ords"            -0.67
    "functional"      -0.66
    Positive Logits
    Token             Logit
    " theater"        1.08
    " theaters"       1.07
    "goers"           1.06
    " movies"         1.00
    " theatre"        0.98
    " movie"          0.96
    " theat"          0.89
    "eers"            0.89
    " premie"         0.88
    " Movies"         0.88
    Activation Density: 0.039%

    No Known Activations