INDEX
    Explanations

    references to ratings and reviews of media, particularly books and films

    Negative Logits
    ìĩ  -0.07
    ç¸  -0.07
    ÃŃst  -0.07
    ercial  -0.07
    ãĤ¦ãĥĪ  -0.07
    _TW  -0.06
    åĪº  -0.06
    elf  -0.06
    leh  -0.06
    plode  -0.06
    Positive Logits
    reviews  0.08
     rating  0.08
     reviews  0.07
    rating  0.07
     ratings  0.07
    -rating  0.07
    Reviews  0.07
    å¦Ļ  0.06
     Reviews  0.06
    ÑĢей  0.06
    Activation Density: 0.004%

    No Known Activations