INDEX
    Explanations

    Foreign language articles/pronouns

    Negative Logits
    Token         Logit
    " brutal"     -0.06
    " INTER"      -0.06
    "Imp"         -0.06
    " calling"    -0.06
    " ماشین"      -0.06
    "Insurance"   -0.06
    "ียนร"        -0.06
                  -0.06
    "روس"         -0.06
    "iembre"      -0.06
    Positive Logits
    Token         Logit
    " các"        0.13
    " những"      0.09
    "Các"         0.09
    " conco"      0.08
    " Các"        0.08
    " tale"       0.07
    "TAG"         0.07
    " τη"         0.07
    " mga"        0.07
    "Lane"        0.07
    Activation Density 0.011%

    No Known Activations