    Explanations

    mentions of immigration and associated societal reactions

    Negative Logits
    Token          Logit
    "isto"         -0.16
    "eton"         -0.15
    "unta"         -0.15
    ".shiro"       -0.14
    " Labels"      -0.14
    " Jan"         -0.14
    "rowsable"     -0.14
    " схем"        -0.14
    " callable"    -0.13
    "ots"          -0.13

    Positive Logits
    Token          Logit
    "375"          0.18
    " Byrne"       0.15
    ".pattern"     0.15
    "нош"          0.14
    "aldi"         0.14
    " éĥ"          0.14
    " pol"         0.14
    "λόγ"          0.14
    ".AWS"         0.14
    "685"          0.14
    Activation Density: 0.141%

    No Known Activations