INDEX
    Explanations
    Negative Logits
        Token            Logit
        "fees"           -0.08
        " dias"          -0.08
        "anne"           -0.08
        " fees"          -0.07
        " diplôm"        -0.07
        " emoties"       -0.07
        "klu"            -0.07
        " années"        -0.07
        "ительности"     -0.07
        "ительных"       -0.07
    Positive Logits
        Token            Logit
        " utak"          0.09
        " partido"       0.09
        " λύ"            0.08
        " pitcher"       0.08
        " pict"          0.08
        " Picture"       0.08
        " pictures"      0.07
        " capire"        0.07
        " الصوت"         0.07
        " picture"       0.07
    Act Density 0.005%

    No Known Activations