INDEX
    Explanations
    Head Attr Weights
    0:0.03
    1:0.01
    2:0.08
    3:0.07
    4:0.11
    5:0.04
    6:0.04
    7:0.40
    8:0.03
    9:0.04
    10:0.06
    11:0.04
    Negative Logits
    artment
    -1.96
    abase
    -1.79
    ividual
    -1.67
    ategory
    -1.61
    illion
    -1.58
    mouth
    -1.56
    TB
    -1.56
    isd
    -1.56
    ovember
    -1.55
    priority
    -1.54
    Positive Logits
     Gleaming
    1.76
     indo
    1.60
     folklore
    1.58
     apartheid
    1.58
     breeze
    1.55
     flakes
    1.55
    [undecodable token]
    1.52
     routines
    1.49
     chants
    1.47
     memories
    1.46
    Act Density 0.000%

    No Known Activations