INDEX
    Negative Logits
     heroin       -0.10
     Kamer        -0.09
     herpes       -0.08
     konkur       -0.08
     pró          -0.08
    akore         -0.08
     chid         -0.08
     camer        -0.08
    Coronavirus   -0.08
     rie          -0.08
    Positive Logits
    _df            0.09
    שת             0.08
    _c             0.08
    Score          0.08
    _s             0.08
    -related       0.08
    _tf            0.08
    _to            0.08
    _data          0.08
    _ctx           0.08
    Activation Density: 0.036%

    No Known Activations