    Explanations

    references to social and economic policies, especially those that involve wealth distribution and political constructs that influence public perception

    Negative Logits
    Token            Logit
    ".springboot"    -0.16
    ".zh"            -0.15
    " doz"           -0.14
    "roz"            -0.13
    " Cumhurbaş"     -0.13
    " kvinnor"       -0.13
    "ossal"          -0.13
    "idak"           -0.13
    "urst"           -0.13
    "素"             -0.13
    Positive Logits
    Token            Logit
    " mas"           0.27
    " via"           0.25
    "via"            0.23
    " through"       0.22
    "aim"            0.21
    " masked"        0.21
    " by"            0.21
    "abet"           0.20
    " clo"           0.20
    " aimed"         0.20
    Activation Density: 0.358%

    No Known Activations