INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.07
    1:0.08
    2:0.07
    3:0.09
    4:0.09
    5:0.08
    6:0.06
    7:0.09
    8:0.09
    9:0.08
    10:0.08
    11:0.08
    Negative Logits
    "endas"     -3.40
    "inho"      -3.03
    "aucuses"   -2.92
    " veto"     -2.90
    "ishops"    -2.87
    "eto"       -2.85
    "mun"       -2.80
    "Pope"      -2.75
    "ushima"    -2.74
    "adra"      -2.73
    Positive Logits
    " ABE"         3.68
    " RL"          2.88
    " MM"          2.85
    " Stevenson"   2.80
    " LI"          2.79
    " QC"          2.76
    " JR"          2.66
    " TN"          2.62
    " ALE"         2.62
    " WD"          2.61
    Act Density 0.000%

    No Known Activations