INDEX
    Explanations

    mentions of research papers being published in various journals

    references to academic publications

    Negative Logits
    llan    -0.91
    vette   -0.83
    xa      -0.78
    hart    -0.77
    zh      -0.70
    ovan    -0.69
    ggle    -0.68
    heed    -0.67
    nea     -0.66
    aho     -0.66
    Positive Logits
    lishing         1.06
    lisher          0.99
     excerpts       0.92
    lishes          0.79
     newsp          0.78
    DragonMagazine  0.75
    Ô               0.75
     behavi         0.74
    gres            0.74
    çīĪ             0.71
    Activation Density: 0.025%

    No Known Activations