INDEX
    Negative Logits
    " stern"          -0.72
    " tant"           -0.71
    " mort"           -0.69
    "assic"           -0.67
    " fathers"        -0.65
    " apex"           -0.64
    " patriarch"      -0.64
    " culminating"    -0.64
    " presumed"       -0.64
    " dashed"         -0.63
    Positive Logits
    "Advertisements"   1.07
    "yip"              0.91
    " Flavoring"       0.89
    "rences"           0.84
    "edIn"             0.82
    "unicip"           0.82
    "arty"             0.79
    " Advertisement"   0.78
    "culosis"          0.78
    "allery"           0.77
    Activation Density: 0.010%

    No Known Activations