    NEGATIVE LOGITS
    Token               Logit
    "ldre"              -0.07
    " Apostle"          -0.07
    " kháng"            -0.07
    "طان"               -0.06
    "165"               -0.06
    " prostituerade"    -0.06
    "chts"              -0.06
    "ckt"               -0.06
    "oul"               -0.06
    "цький"             -0.06
    POSITIVE LOGITS
    Token               Logit
    " fee"              0.19
    " fees"             0.17
    " Fee"              0.16
    " Fees"             0.14
    "Fee"               0.13
    "_fee"              0.11
    "fee"               0.11
    " Levy"             0.09
    "_fu"               0.08
    "EF"                0.08
    Activation Density: 0.006%

    No Known Activations