    Explanations

    references to political events and figures

    Negative Logits
    agna        -0.17
    .sax        -0.15
    _RB         -0.14
    adow        -0.14
    eton        -0.13
    obar        -0.13
    hed         -0.13
    orro        -0.13
    isses       -0.13
    akra        -0.13
    Positive Logits
    代          0.15
    avern       0.15
     mandates   0.15
     distant    0.14
     natural    0.14
     mandate    0.14
    /renderer   0.14
    xic         0.14
    βι          0.13
    incip       0.13
    Activation Density: 0.034%

    No Known Activations