INDEX
    Explanations

    declarative statements

    Negative Logits
    Token               Logit
    " archival"         -0.08
    " tug"              -0.07
    " Protect"          -0.07
    " Thrive"           -0.07
    " پال"              -0.07
    " runaway"          -0.07
    " Rehab"            -0.07
    " نرم"              -0.07
    " simultaneously"   -0.07
    " التمو"            -0.07
    Positive Logits
    Token               Logit
    " nostri"           0.09
    "さて"              0.09
    " स्वर"              0.08
    " formação"         0.08
    " endereco"         0.08
                        0.08
    " escrita"          0.08
    "회사"              0.08
    "また"              0.08
    "Squares"           0.07
    Activation Density: 0.006%

    No Known Activations