INDEX
    Explanations

    names of political figures from various parties and contexts

    Negative Logits
    Token                   Logit
    "Compare"               -0.74
    "yz"                    -0.65
    " Lauder"               -0.61
    ".................."    -0.60
    " Compare"              -0.57
    " Rothschild"           -0.57
    "EVA"                   -0.56
    "PN"                    -0.55
    "LIN"                   -0.55
    " Pwr"                  -0.55
    Positive Logits
    Token                   Logit
    "ELF"                   1.08
    "ullivan"               1.05
    " own"                  1.04
    " newest"               0.92
    "selves"                0.89
    " favourite"            0.87
    " favorite"             0.85
    "pecially"              0.83
    "kaya"                  0.83
    "lightly"               0.82
    Activation Density: 0.949%

    No Known Activations