    Explanations

    phrases related to ratings and scores

    Negative Logits
    Token        Logit
    "bus"        -0.14
    "ittest"     -0.14
    "Ù쨳"        -0.14
    "alous"      -0.14
    "ode"        -0.13
    "etine"      -0.13
    "PerPixel"   -0.13
    " thôi"      -0.13
    "peror"      -0.13
    "noop"       -0.13
    Positive Logits
    Token        Logit
    " stars"     0.34
    " rating"    0.27
    "stars"      0.26
    " Stars"     0.23
    "rating"     0.23
    " star"      0.23
    " Rating"    0.22
    "-stars"     0.21
    " score"     0.20
    "_rating"    0.20
    Activation Density: 0.049%

    No Known Activations