INDEX
    Explanations
    Negative Logits
    " Athens"     -0.07
    " fy"         -0.07
    " Those"      -0.07
    " Vz"         -0.07
    "178"         -0.07
    " doubts"     -0.07
    " that"       -0.07
    " boat"       -0.06
    ")y"          -0.06
    " dom"        -0.06
    Positive Logits
    " include"    0.09
    "ide"         0.07
    "CL"          0.07
    " including"  0.07
                  0.06
    "جه"          0.06
    "CLUDING"     0.06
    "нее"         0.06
    "MIC"         0.06
    "including"   0.06
    Activation Density: 0.097%

    No Known Activations