INDEX
    Negative Logits
    Token           Logit
    " fær"          -0.84
    " hulle"        -0.84
    " hvordan"      -0.82
    " tirs"         -0.81
    "руем"          -0.80
    "scraper"       -0.80
    "adilan"        -0.79
    " أو"           -0.79
    " FERR"         -0.79
    "UpdateDate"    -0.78
    Positive Logits
    Token           Logit
    " margins"      0.98
    " gutter"       0.86
    " dragging"     0.83
    " focus"        0.82
    "Děkuji"        0.81
    "translate"     0.81
    " described"    0.79
    " showing"      0.78
    " mobile"       0.77
    " annoying"     0.76
    Activation Density: 0.015%

    No Known Activations