    Negative Logits
    -0.08
     Arch
    -0.07
     forfeiture
    -0.07
    Alter
    -0.07
     Lambert
    -0.07
     initComponents
    -0.06
    -0.06
    ают
    -0.06
     Architect
    -0.06
    -0.06
    Positive Logits
     BCHP
    0.06
     крем
    0.06
    oppers
    0.06
    0.06
     hilarious
    0.06
    215
    0.06
     Grinding
    0.06
    Loads
    0.06
    Nm
    0.06
    urve
    0.06
    Act Density 0.012%

    No Known Activations