INDEX
    Head Attr Weights
    0:0.08
    1:0.09
    2:0.07
    3:0.09
    4:0.08
    5:0.08
    6:0.07
    7:0.08
    8:0.08
    9:0.09
    10:0.08
    11:0.07
    Negative Logits
    " Firefly":-2.76
    " Shadows":-2.71
    " Tsukuyomi":-2.70
    " inadvertently":-2.66
    " Sgt":-2.65
    " Fletcher":-2.63
    " Earl":-2.63
    " Eddie":-2.61
    "rower":-2.59
    " Transformers":-2.55
    Positive Logits
    "aur":3.27
    "Sov":2.84
    " Azerbaijan":2.79
    "Syria":2.78
    "India":2.78
    [undecodable token]:2.77
    "Iran":2.75
    " Armenia":2.73
    "rique":2.72
    "enough":2.71
    Act Density 0.000%

    No Known Activations