INDEX
    Explanations
    NEGATIVE LOGITS
    Token         Logit
    "protoimpl"   -0.52
    " Always"     -0.50
    "nostic"      -0.50
    " EClass"     -0.49
    " etc"        -0.48
    " I"          -0.46
    " Both"       -0.46
    " dagegen"    -0.46
    " Otherwise"  -0.46
    "ضي"          -0.45
    POSITIVE LOGITS
    Token         Logit
    " thanks"     2.21
    " courtesy"   1.52
    " gracias"    1.49
    " graças"     1.48
    "thanks"      1.39
    " grazie"     1.33
    "courtesy"    1.31
    " grâce"      1.21
    " THANKS"     1.18
    " owing"      1.14
    Activation Density: 0.143%

    No Known Activations