INDEX
    Explanations
    Negative Logits
    Token          Logit
    "ра"           1.13
    "T"            1.11
    ">"            1.03
    "ل"            1.01
    "Ι"            1.00
    "скохозяй"     0.95
    "ا"            0.95
    "рана"         0.93
    "ره"           0.92
    "rint"         0.92
    Positive Logits
    Token          Logit
    "AndView"      1.19
    "heet"         1.13
    "osi"          1.13
    "pp"           1.09
    " duas"        1.08
    " ppt"         1.05
    "chanics"      1.04
                   1.03
    "ಿದ"           1.02
    "jectory"      1.02
    Activation Density: 0.001%

    No Known Activations