INDEX
    Explanations

    advertising

    Negative Logits
    Token          Logit
    "QUOTE"        -0.07
    " caf"         -0.07
    " addicted"    -0.07
    " kW"          -0.06
    "Aff"          -0.06
    " зап"         -0.06
                   -0.06
    " clinging"    -0.06
    "PROGRAM"      -0.06
    " Пред"        -0.06
    Positive Logits
    Token            Logit
    " invokes"       0.07
    " illustrated"   0.07
    " con"           0.06
    " copied"        0.06
    " inhabitants"   0.06
    "артам"          0.06
    "arts"           0.06
    " عاشق"          0.06
    "pl"             0.06
    "essions"        0.06
    Activation Density: 0.002%

    No Known Activations