INDEX
    Explanations
    Negative Logits
    "ingat"         -0.08
    "_typ"          -0.08
    " Industri"     -0.08
    " Typ"          -0.08
    "harap"         -0.07
    "(Layout"       -0.07
    "_atom"         -0.07
    " Hydra"        -0.07
    "Typ"           -0.07
    " typ"          -0.07
    Positive Logits
    " paperwork"    0.08
    " бумаги"       0.08
    " linkage"      0.08
    "forge"         0.08
    "papers"        0.08
    "bourne"        0.08
    " allegiance"   0.08
    " plagiarism"   0.08
    "imore"         0.08
    "bewijs"        0.07
    Activation Density: 0.006%

    No Known Activations