    Negative Logits
    Token             Logit
    "<unused481>"     0.32
    "ّل"              0.30
    " Każ"            0.30
    "<unused1975>"    0.30
    " fasse"          0.30
    " proporcion"     0.29
    "<unused1012>"    0.29
    " celui"          0.29
    " سازمان"         0.29
    "<unused108>"     0.29
    Positive Logits
    Token               Logit
    " typology"         0.46
    " Handbook"         0.40
    " handbook"         0.38
    " monograph"        0.38
    "Handbook"          0.38
    " typical"          0.37
    " interpretation"   0.35
    " monographs"       0.35
    " practical"        0.34
    " cookbook"         0.34
    Activation Density: 0.000%

    No Known Activations