INDEX
    Explanations
    NEGATIVE LOGITS
    "DOS"         -0.68
    " Govern"     -0.67
    "aution"      -0.66
    "kson"        -0.66
    "ategory"     -0.66
    "entric"      -0.66
    "agonists"    -0.66
    "orld"        -0.65
    "FINE"        -0.64
    "Govern"      -0.63
    POSITIVE LOGITS
    "fruit"        1.24
    " fruit"       1.09
    "cake"         1.06
    " juice"       1.05
    " fruits"      0.97
    "cakes"        0.94
    "ruit"         0.87
    "fulness"      0.84
    "nect"         0.83
    " juices"      0.79
    Act Density: 0.019%

    No Known Activations