INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.08
    2:0.07
    3:0.08
    4:0.08
    5:0.08
    6:0.08
    7:0.07
    8:0.08
    9:0.09
    10:0.08
    11:0.08
    Negative Logits
    " strengths"    -1.69
    " Cosponsors"   -1.68
    "orsche"        -1.50
    " Compet"       -1.49
    " Integrity"    -1.48
    " merits"       -1.47
    " petertodd"    -1.46
    "igers"         -1.44
    " laure"        -1.44
    " advoc"        -1.42
    Positive Logits
    "stretched"     1.78
    " contracting"  1.73
    "emi"           1.64
    "etting"        1.61
    "eming"         1.53
    " humming"      1.51
    "figure"        1.50
    "larg"          1.42
    "elong"         1.42
    " swelling"     1.41
    Act Density 0.000%

    No Known Activations