    Explanations

    age information mentioned in the text

    Negative Logits
    Token               Logit
    " constitu"         -0.68
    "ebin"              -0.63
    "access"            -0.60
    "izens"             -0.60
    " dictators"        -0.60
    " stabilization"    -0.57
    "oids"              -0.57
    " âī"               -0.57
    " manifold"         -0.57
    " elimination"      -0.57

    Positive Logits
    Token               Logit
    "%,"                0.88
    " Rue"              0.83
    "%-"                0.79
    "yo"                0.78
    "rd"                0.74
    "th"                0.73
    "%;"                0.72
    " Downing"          0.71
    "cm"                0.71
    "½"                 0.70
    Activation Density: 0.052%

    No Known Activations