    Negative Logits
     shaped       -0.73
    WithNo        -0.71
    gotten        -0.70
    jected        -0.67
    stru          -0.67
     flanked      -0.65
    iety          -0.64
     gad          -0.64
     cru          -0.64
     psychiat     -0.64
    Positive Logits
    %-            0.83
    FREE          0.77
    -+            0.75
    payer         0.74
    lein          0.72
    !/            0.71
    fps           0.70
    %             0.70
    +)            0.70
     ABV          0.69
    Act Density 0.057%

    No Known Activations