INDEX
    Explanations

    explaining or clarifying

    NEGATIVE LOGITS
    " чтоб"            -1.37
    "🪛"               -1.32
    "แค่"              -1.31
    "ोंने"             -1.30
    " чтобы"           -1.27
    " żeby"            -1.24
    "beiros"           -1.24
    "​​​"              -1.24
    " taky"            -1.23
    "𓏸"               -1.23

    POSITIVE LOGITS
    " this"            1.80
    "NOTE"             1.31
    " ristor"          1.17
    " of"              1.16
    " слід"            1.15
    "就此"             1.13
    "During"           1.13
    [token missing]    1.13
    " този"            1.11
    " onderstaande"    1.11

    Activation Density: 0.090%

    No Known Activations