INDEX
    Explanations

    references to countries, states, political figures, and government-related topics

    mentions of political entities and related organizations

    Negative Logits
    "sed"          -0.71
    "urated"       -0.66
    "named"        -0.66
    "attr"         -0.65
    "£ı"           -0.64
    "lished"       -0.64
    "è£"           -0.62
    "_>"           -0.62
    "mentioned"    -0.61
    ">("           -0.61
    Positive Logits
    " needs"        1.40
    " should"       1.32
    " shouldn"      1.29
    " cannot"       1.28
    " lacks"        1.28
    " deserves"     1.27
    " ought"        1.26
    " intends"      1.25
    " must"         1.21
    "needs"         1.16
    Act Density 0.450%

    No Known Activations