INDEX
    Explanations
    Negative Logits
    [token missing]    1.16
    "ے"    1.06
    "emphasis"    1.03
    "гле"    1.02
    " albeit"    1.00
    " hearted"    0.99
    [token missing]    0.99
    " golden"    0.98
    " anden"    0.98
    " allegation"    0.98
    Positive Logits
    " riscos"    1.35
    " transferência"    1.31
    "ciation"    1.27
    " redor"    1.26
    " cierta"    1.24
    " peligros"    1.24
    "amese"    1.24
    " стала"    1.23
    "্ধ্য"    1.22
    " tecnica"    1.21
    Act Density 0.023%

    No Known Activations