INDEX
    Explanations
    Negative Logits
    " uph"           0.58
    " juries"        0.58
    " institutes"    0.55
    " contesting"    0.54
    "ocrats"         0.51
    "manuel"         0.50
    " instituted"    0.50
    " punished"      0.50
    " agencies"      0.49
    " uphold"        0.48
    Positive Logits
    " Características"      0.56
    " Можно"                0.55
    "↵↵"                    0.53
    (token not rendered)    0.52
    "و"                     0.52
    (token not rendered)    0.52
    "לו"                    0.51
    " Màu"                  0.51
    " prodotto"             0.50
    "氧化"                  0.50
    Act Density 0.000%

    No Known Activations