INDEX
    Explanations

    phrases related to politics and citizenship

    Negative Logits
    Token           Logit
    "pher"          -0.71
    "aceae"         -0.67
    "mone"          -0.66
    "affles"        -0.66
    "ovan"          -0.64
    "amus"          -0.64
    "abbit"         -0.64
    " Disciple"     -0.64
    "idated"        -0.64
    "apter"         -0.63
    Positive Logits
    Token           Logit
    "erton"         1.16
    "screen"        1.02
    " blown"        0.88
    " fled"         0.87
    " throttle"     0.82
    "heartedly"     0.82
    " frontal"      0.81
    "fledged"       0.81
    "blown"         0.81
    " complement"   0.80
    Activation Density: 0.031%

    No Known Activations