    Negative Logits
    Heart       -0.07
     aliment    -0.07
    لق          -0.06
     хвор       -0.06
     ور         -0.06
     Fran       -0.06
    holding     -0.06
    ateg        -0.06
     christ     -0.06
    chemistry   -0.06
    Positive Logits
     двох       0.07
    utto        0.07
     زیادی      0.07
     raced      0.06
     Можно      0.06
     domácí     0.06
     COPYRIGHT  0.06
    χρι         0.06
    _/          0.06
    0.06
    Activation Density: 0.011%

    No Known Activations