INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.06
    2:0.10
    3:0.07
    4:0.08
    5:0.08
    6:0.08
    7:0.07
    8:0.08
    9:0.08
    10:0.08
    11:0.08
    Negative Logits
     Isaac      -1.95
     Metatron   -1.80
     Gutenberg  -1.80
     Franch     -1.72
     Rebell     -1.64
     Militia    -1.61
     Saras      -1.60
     Muse       -1.57
     Iro        -1.53
     Arri       -1.52
    Positive Logits
    ogether     1.84
    ornia       1.81
    quartered   1.81
    ottesville  1.77
    ackets      1.75
    arton       1.75
    orously     1.75
    arnaev      1.74
    rans        1.74
    together    1.73
    Act Density 0.000%

    No Known Activations
