    Negative Logits
    ഷ               -0.27
    icamente        -0.27
    iddle           -0.27
    enth            -0.25
     differentiate  -0.25
    idores          -0.25
    :length         -0.25
    霄               -0.25
     diferencia     -0.24
    感               -0.24
    Positive Logits
    DOI             0.28
     gle            0.27
    总收入             0.27
     agreed         0.25
    =head           0.25
    Vac             0.24
    作者本人            0.24
    UCT             0.23
    辚               0.23
     AUTHOR         0.23
    Activation Density 0.005%

    No Known Activations