    Explanations

    mentions of political or controversial topics

    Negative Logits
     juges           -0.66
    mistak           -0.61
     vœux            -0.58
    zijde            -0.58
     modalités       -0.56
     exemplaires     -0.53
     prochaines      -0.53
     évaluations     -0.50
    thinkable        -0.50
    relenting        -0.50
    Positive Logits
     palab           0.74
    fordable         0.72
     felipe          0.69
     palio           0.66
    ñora             0.66
     doman           0.66
     hcm             0.65
     laci            0.65
     romero          0.64
     juf             0.64
    Act Density 0.240%

    No Known Activations