    Negative Logits
    Token          Logit
    " и"           0.42
    " Госпо"       0.41
    " удобно"      0.41
    " считают"     0.39
    "вающая"       0.39
    " potty"       0.39
    " utilize"     0.38
    "Tennessee"    0.38
    " way"         0.36
    " венти"       0.36
    Positive Logits
    Token           Logit
    "データベース"  0.63
    " database"     0.61
    "数据库"        0.60
    " databases"    0.59
    " Datenbank"    0.59
    "Database"      0.56
    " Database"     0.55
    " filenames"    0.52
    " katalog"      0.51
    "database"      0.50
    Activation Density: 0.003%

    No Known Activations