    NEGATIVE LOGITS
     huz            -0.08
    .az             -0.07
    .win            -0.07
     learn          -0.07
     holidays       -0.07
     cím            -0.07
                    -0.07
    .listeners      -0.07
    Battle          -0.07
    .pt             -0.07
    POSITIVE LOGITS
     Indicates      0.11
     تاکید          0.09
     denotes        0.08
     обознач        0.08
    gaat            0.08
     bedoeld        0.08
     indicar        0.08
     indicating     0.08
    ეა              0.08
    Anything        0.08
    Activation Density: 0.013%

    No Known Activations