INDEX
    Explanations

    film-related words, possibly related to reviews or evaluations

    references to films and movies

    Negative Logits
    Token           Logit
    " condition"    -0.65
    " wheelchair"   -0.62
    " bluff"        -0.62
    " ridge"        -0.60
    "stone"         -0.60
    "LESS"          -0.59
    " Islanders"    -0.59
    "FUL"           -0.59
    " mechanism"    -0.59
    " nurse"        -0.58
    Positive Logits
    Token           Logit
    "ynthesis"      1.03
    "earch"         0.96
    "ovies"         0.95
    "uggest"        0.89
    "ystem"         0.88
    "chool"         0.87
    "ensitive"      0.87
    "cape"          0.86
    "aurus"         0.85
    " starring"     0.84
    Activation Density: 0.049%

    No Known Activations