INDEX
    Explanations
    NEGATIVE LOGITS
    " воздуха"  0.52
    "¢"  0.45
    "водит"  0.45
    " детям"  0.44
    " dùng"  0.43
    " பக்த"  0.43
    " Зак"  0.43
    " அதிகம்"  0.43
    " Сак"  0.42
    " করলে"  0.42
    POSITIVE LOGITS
    " plagiarism"  0.45
    " Patreon"  0.43
    " criminal"  0.42
    " haven"  0.42
    " reconciliation"  0.42
    "o"  0.42
    "解决方案"  0.42
    " struggle"  0.41
    " Tolkien"  0.41
    " deceit"  0.41
    Activation Density: 0.011%

    No Known Activations