INDEX
    Explanations

    phrases related to political positions and stances

    Negative Logits
    ennen
    -0.16
    опри
    -0.16
    ontent
    -0.15
    ONT
    -0.15
    ũ
    -0.14
    ONGO
    -0.14
    oufl
    -0.14
    ょ
    -0.14
    ··
    -0.13
    ็ตาม
    -0.13
    Positive Logits
     on
    0.77
     trên
    0.44
     على
    0.41
     на
    0.41
     auf
    0.37
     på
    0.36
     regarding
    0.35
     pada
    0.33
    on
    0.33
    	on
    0.32
    Act Density 0.148%
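
For readers unfamiliar with the density figure, here is a minimal sketch of how such a value could be computed. It assumes that activation density means the fraction of token positions on which the feature fires above a threshold; that definition, the function name `activation_density`, and the sample data are all illustrative assumptions, not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is active.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 15 of which activate the feature.
sample = [0.0] * 9985 + [1.2] * 15
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.148% would mean the feature is active on roughly 15 out of every 10,000 tokens sampled.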

    No Known Activations