INDEX
    Explanations

    instances of the term "review" in various contexts

    Negative Logits
    Token         Logit
    " Reviews"    -0.20
    " review"     -0.20
    "Reviews"     -0.19
    " reviews"    -0.18
    " reviewed"   -0.17
    "vince"       -0.17
    "unter"       -0.16
    "_reviews"    -0.16
    "reviews"     -0.16
    "cht"         -0.16
    Positive Logits
    Token         Logit
    "able"        0.26
    "ees"         0.24
    "ers"         0.23
    "ee"          0.21
    "ables"       0.19
    "/meta"       0.19
    "ABLE"        0.18
    "/comment"    0.18
    "eing"        0.17
    "avar"        0.17
    Activation Density: 0.039%

    No Known Activations