INDEX
    Explanations
    Negative Logits
    " "         0.98
    "\"         0.91
    "^*\"       0.84
                0.83
    " заяви"    0.79
    "<0x0D>"    0.78
    "ної"       0.77
    "s"         0.77
    "ία"        0.75
    "ني"        0.73
    Positive Logits
    "ir"              1.12
    "ia"              1.09
    " reliable"       1.06
    " unreliable"     1.06
    "可靠"            1.05
    " reliability"    1.02
    "reliability"     1.02
    "信頼"            1.00
    " zuverläss"      1.00
    "िंग"             0.97
    Act Density 0.015%

    No Known Activations