INDEX
    Explanations

    names and titles related to political figures and their affiliations

    Negative Logits
    abble     -0.17
    imli      -0.16
    ÑĮÑİ      -0.15
    ayi       -0.14
    yne       -0.14
     Kraft    -0.14
    place     -0.14
    vale      -0.14
    .mixin    -0.14
    esi       -0.13
    Positive Logits
    ognition  0.16
    á»Ļn      0.15
    enville   0.14
    ç²¾ç¥ŀ    0.14
     spit     0.14
    asher     0.14
    κη        0.14
    andid     0.13
    947       0.13
    ÑĻ        0.13
    Act Density 0.014%

    No Known Activations