INDEX
    Explanations

    phrases related to politics and government, including names of political figures, policy proposals, and official statements

    Negative Logits
     malheure      -1.02
     étrang        -1.00
     carrefour     -1.00
     plong         -0.97
     malheureux    -0.93
     prétend       -0.88
     héro          -0.86
     cahier        -0.84
     hcm           -0.83
     Miscell       -0.82
    Positive Logits
     aware      0.73
     able       0.72
     glad       0.71
     ready      0.71
     willing    0.71
     afraid     0.68
     gonna      0.68
     proud      0.68
     unable     0.67
     pleased    0.67
    Activation Density: 0.249%

    No Known Activations