INDEX
    Explanations

    author names and their associated publications

    Negative Logits
    _("         -0.15
    eprom       -0.15
    warf        -0.14
    _PCM        -0.14
    leans       -0.14
    IPH         -0.14
    iliz        -0.14
    lean        -0.14
    harma       -0.14
     mand       -0.13
    Positive Logits
    sko         0.16
     Mueller    0.14
    avra        0.14
    ohana       0.14
    ëģ          0.13
    oyal        0.13
    addock      0.13
    Ấ           0.13
    ayout       0.13
    antal       0.13
    Activation Density: 0.050%

    No Known Activations