INDEX
    Explanations
    NEGATIVE LOGITS
    ".rooms"       -0.08
    "/month"       -0.07
    " duvar"       -0.07
    "_Tab"         -0.06
    "/action"      -0.06
    " dirty"       -0.06
    ".week"        -0.06
    " Trinidad"    -0.06
    " Europe"      -0.06
    "пион"         -0.06
    POSITIVE LOGITS
    " rebel"        0.12
    " Rebels"       0.11
    " rebels"       0.10
    " Rebel"        0.08
    " insurg"       0.07
    "gle"           0.07
    " quake"        0.06
    " seize"        0.06
    "qry"           0.06
    " inversion"    0.06
    Activation Density: 0.004%

    No Known Activations