    NEGATIVE LOGITS
    TOKEN         LOGIT
    readAll       -0.48
    ReadAll       -0.47
    !("{          -0.46
     Públicas     -0.42
    endphp        -0.41
    mulher        -0.41
     Allgeme      -0.41
     Haush        -0.40
     publiques    -0.39
     Compañ       -0.39
    POSITIVE LOGITS
    TOKEN         LOGIT
     incentive    1.90
     Incentive    1.85
     incentives   1.79
    Incenti       1.73
     Incentives   1.72
    incenti       1.66
     incenti      1.26
     Incenti      1.19
     incentiv     1.16
     reward       0.87
    Activation Density 0.004%

    No Known Activations