INDEX
    Explanations

    uphill and downhill movement

    Negative Logits
    Token                 Value
    "一颗"                1.01
    " getContentPane"     0.97
    " notifies"           0.94
    "ະພັນ"                0.90
    "⦿"                   0.88
    " batang"             0.87
    " entitles"           0.87
    " válida"             0.86
    "nymi"                0.86
    (blank)               0.86
    Positive Logits
    Token                 Value
    "fra"                 0.77
    (blank)               0.73
    "den"                 0.69
    "л"                   0.69
    "i"                   0.68
    "ীন"                  0.66
    "ie"                  0.65
    (blank)               0.65
    "del"                 0.65
    " allá"               0.64
    Activation Density: 0.003%

    No Known Activations