INDEX
    Explanations

    names of political figures or terms related to political events

    Negative Logits
    "glers"      -0.94
    "ahime"      -0.75
    "istically"  -0.66
    " [|"        -0.66
    "ERY"        -0.65
    "VILLE"      -0.65
    "phal"       -0.62
    "beit"       -0.62
    " Cage"      -0.61
    " Leap"      -0.61
    Positive Logits
    "orters"     1.50
    "rint"       1.36
    "orter"      1.30
    "rieve"      1.28
    "utations"   1.25
    "ublic"      1.22
    "utation"    1.21
    "ository"    1.20
    "orted"      1.16
    "atri"       1.16
    Act Density 0.510%

    No Known Activations