INDEX
    Explanations
    No Explanations Found
    Negative Logits
    "Стра"         1.13
    "Созда"        1.03
    "Furthermore"  1.02
    "Ό"            1.00
    "Исто"         0.96
    "ين"           0.94
    "ວຍ"           0.93
    "ینگ"          0.91
    "Sebagai"      0.91
    "ным"          0.91
    Positive Logits
    "e"             1.02
    " indignation"  1.02
    " indigestion"  0.99
    " hice"         0.98
    " airline"      0.96
    "eat"           0.94
    "i"             0.94
    "o"             0.94
    "et"            0.93
    " n"            0.93
    Activation Density: 0.049%

    No Known Activations