INDEX
    Explanations
    Head Attr Weights
    0:0.08
    1:0.09
    2:0.08
    3:0.07
    4:0.08
    5:0.07
    6:0.07
    7:0.07
    8:0.08
    9:0.09
    10:0.07
    11:0.08
    Negative Logits
     tox          -2.86
    ultz          -2.62
    phasis        -2.57
     pres         -2.57
     Pillar       -2.55
    utical        -2.55
     Philips      -2.55
    idential      -2.52
    nant          -2.52
    Glass         -2.50
    Positive Logits
     Romero       3.21
     Fey          2.80
     canvas       2.74
     streng       2.74
     Buchanan     2.62
     Berm         2.62
     Fah          2.54
     Boxing       2.54
     Kum          2.54
     Christianity 2.51
    Act Density 0.000%

    No Known Activations