    NEGATIVE LOGITS
    " standard"         -0.62
    "Демографія"        -0.61
    "standard"          -0.54
    " tør"              -0.51
    "Standard"          -0.48
    " aveug"            -0.48
    " compréhen"        -0.48
    "func"              -0.47
    "hofen"             -0.47
    "Carex"             -0.47
    POSITIVE LOGITS
    "PMailer"            0.90
    "MigrationBuilder"   0.81
    " Exactos"           0.75
    " CreateTagHelper"   0.73
    " otomatig"          0.70
    " Wicidata"          0.68
    "saraba"             0.68
    "apimachinery"       0.67
    " snippetHide"       0.66
    " Vikipedi"          0.66
    Activation Density: 0.052%

    No Known Activations