NEGATIVE LOGITS
    " trim"          -0.06
    ".Utility"       -0.06
    "<Menu"          -0.06
    " want"          -0.06
    " Cult"          -0.06
    " data"          -0.06
    " texts"         -0.06
    " کم"            -0.06
    " extremists"    -0.06
    " Fall"          -0.06
POSITIVE LOGITS
    " Review"        0.12
    " review"        0.11
    " reviews"       0.09
    " REVIEW"        0.08
    "Review"         0.08
    "_MACRO"         0.07
    "FRINGEMENT"     0.07
    "-contained"     0.07
    "review"         0.07
    ".quick"         0.06
    Activation Density: 0.020%

    No Known Activations