    NEGATIVE LOGITS
     adlı               0.44
     adrenalin          0.42
     disinformation     0.42
     config             0.42
     friendship         0.40
     disadvantage       0.40
     platform           0.40
    '                   0.40
     hangi              0.39
     dodgy              0.39
    POSITIVE LOGITS
     таксама            0.52
     початку            0.50
     начиная            0.46
    Throughout          0.44
    Pregnant            0.44
     थी                 0.44
     толькі             0.43
    ปกติ                0.42
    чках                0.42
     भी                 0.41
    Activation Density: 0.010%

    No Known Activations