INDEX
    Negative Logits
     cũng         -0.06
     Scene        -0.06
     parece       -0.06
    neapolis      -0.06
    apl           -0.06
     دخ           -0.06
    	ob        -0.06
     pushing      -0.06
     завжди       -0.06
    λεκ           -0.06
    Positive Logits
     ratio        0.06
                  0.06
    .*            0.06
    Interested    0.06
     DACA         0.06
    ポート        0.06
     ratios       0.06
    ENUM          0.06
                  0.06
     TERMS        0.06
    Activation Density: 0.014%

    No Known Activations