INDEX
    Explanations

    mentions of scientific research being published in journals

    occurrences of the word "published."

    Negative Logits
    llan          -0.90
    hart          -0.77
    xa            -0.71
    vette         -0.68
     Architects   -0.68
    Ĭ±            -0.67
     awakened     -0.66
    aturation     -0.65
    uth           -0.65
    ichael        -0.65
    Positive Logits
    lishing       0.99
     excerpts     0.93
    lisher        0.92
    lishes        0.75
    Ô             0.70
     behavi       0.70
    itatively     0.69
     exploits     0.69
    URL           0.69
     newsp        0.68
    Activation Density: 0.027%

    No Known Activations