    Negative Logits
     novels       -0.07
     astonished   -0.07
     srpna        -0.06
     diseases     -0.06
     Four         -0.06
     Frid         -0.06
    swer          -0.06
     nov          -0.06
     Twelve       -0.06
    (contents     -0.06
    Positive Logits
    CppMethod     0.07
     žal          0.07
    #aa           0.07
    озвращ        0.07
    CppI          0.07
     HOLDER       0.07
     illum        0.06
     Dek          0.06
     koşul        0.06
    FORCE         0.06
    Act Density 0.006%

    No Known Activations