    Explanations

    words related to published works or academic studies

    instances of the word "published."

    Negative Logits
    Token       Logit
    "aho"       -0.80
    "nea"       -0.80
    "xa"        -0.77
    "llan"      -0.76
    "hart"      -0.76
    "atra"      -0.75
    "avery"     -0.73
    "ichael"    -0.72
    "ascar"     -0.72
    "uppet"     -0.72
    Positive Logits
    Token               Logit
    "lishing"           1.23
    "lisher"            1.13
    "published"         1.02
    " published"        1.00
    " publication"      0.98
    " publishes"        0.96
    " behavi"           0.94
    "lishes"            0.93
    " RELE"             0.92
    "DragonMagazine"    0.91
    Activation Density: 0.022%

    No Known Activations