INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.08
    2:0.08
    3:0.08
    4:0.08
    5:0.09
    6:0.08
    7:0.08
    8:0.08
    9:0.07
    10:0.09
    11:0.07
    Negative Logits
    " Aires"     -2.71
    "ulner"      -2.46
    "bright"     -2.43
    " Soph"      -2.33
    "irens"      -2.30
    "scl"        -2.27
    "hea"        -2.25
    "tmp"        -2.23
    " Bohem"     -2.22
    "nel"        -2.19
    Positive Logits
    " Publication"     2.63
    "ipedia"           2.51
    " textual"         2.50
    " encyclopedia"    2.46
    " preferential"    2.39
    "iety"             2.37
    " aggregate"       2.32
    " favorably"       2.30
    " Publications"    2.26
    " Article"         2.26
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.