INDEX
    Explanations

    mentions of political parties and their affiliations

    Negative Logits
     Fork
    -0.15
    favor
    -0.15
    essaging
    -0.15
     subs
    -0.15
    録
    -0.14
    ENDOR
    -0.14
    ILING
    -0.14
    ीश
    -0.14
    ıb
    -0.14
    Lng
    -0.14
    Positive Logits
    imits
    0.16
    мит
    0.15
    aggio
    0.15
    eless
    0.14
    .createFrom
    0.14
    agal
    0.14
    assi
    0.14
    andler
    0.14
    hardt
    0.13
    ват
    0.13
    Activation Density 0.014%

    No Known Activations