INDEX
    Explanations

    performance comparison and results

    Negative Logits
     финансо  0.64
     कानून  0.61
    电子商务  0.61
     політи  0.55
    电商  0.55
    0.55
     полити  0.54
     официа  0.54
    公开  0.54
     유명  0.54
    Positive Logits
     consistently  0.63
     results  0.59
     decreased  0.55
     low  0.54
     moderately  0.54
     scores  0.53
     slightly  0.52
     almost  0.52
     showed  0.51
     outperforms  0.51
    Activation Density: 0.036%

    No Known Activations