INDEX
    Explanations
    NEGATIVE LOGITS
     Empires      -0.77
    INGTON        -0.69
    DonaldTrump   -0.68
    selves        -0.65
     Pradesh      -0.63
     consulate    -0.63
    izons         -0.61
     Centauri     -0.60
     fel          -0.59
     bottleneck   -0.59
    POSITIVE LOGITS
    urally        1.03
    aneers        0.86
    books         0.86
    aign          0.85
    ural          0.82
    builder       0.81
    friend        0.78
     coached      0.77
    masters       0.76
    anson         0.75
    Activation Density: 0.021%

    No Known Activations