    Explanations

    references to academic articles or publications

    Head Attr Weights
    0:0.09
    1:0.03
    2:0.12
    3:0.13
    4:0.10
    5:0.07
    6:0.07
    7:0.03
    8:0.12
    9:0.08
    10:0.07
    11:0.03
    Negative Logits
    ographers   -1.36
    tein        -1.33
    alde        -1.30
     Liber      -1.26
     Lies       -1.26
     Scores     -1.24
    paio        -1.22
    hower       -1.21
     Editorial  -1.21
     Literary   -1.21
    Positive Logits
    env         1.39
    gam         1.33
    bring       1.17
    ffect       1.16
     pint       1.15
    bd          1.15
    7601        1.14
     headset    1.14
    upt         1.14
     bro        1.13
    Act Density 0.001%

    No Known Activations