INDEX
    Explanations
    Negative Logits
    ijing    -0.09
     مشابه    -0.08
    -stage    -0.08
    касці    -0.08
    касць    -0.08
    ikuva    -0.08
    how    -0.08
    方面    -0.08
    antage    -0.08
    าษ    -0.08
    Positive Logits
     verplicht    0.10
     NEVER    0.08
     MUST    0.08
     INV    0.08
    തിന്    0.08
     intoxic    0.08
     обязатель    0.08
     debes    0.08
     solltest    0.08
     forgot    0.08
    Act Density 0.000%

    No Known Activations