    Negative Logits
    Token          Logit
    " risky"       -0.07
    " rise"        -0.07
    " iler"        -0.07
    " akce"        -0.07
    "_jobs"        -0.07
    "wicklung"     -0.07
    " Eagle"       -0.07
    "Grad"         -0.06
    " folks"       -0.06
    " sage"        -0.06
    Positive Logits
    Token              Logit
    " contains"        0.14
    " contain"         0.14
    " containing"      0.12
    " contained"       0.10
    " contaminated"    0.09
    "contains"         0.09
    "contain"          0.08
    " Contains"        0.08
                       0.08
    "Contains"         0.08
    Act Density 0.036%

    No Known Activations