INDEX
    Explanations

    phrases or words related to something being acceptable or not

    concepts of acceptability and standards

    Negative Logits
    Token             Logit
    "hunt"            -0.82
    "berry"           -0.80
    "king"            -0.79
    "dream"           -0.78
    "hunter"          -0.76
    "wan"             -0.74
    "set"             -0.73
    "hung"            -0.73
    "GPU"             -0.72
    "older"           -0.72
    Positive Logits
    Token             Logit
    " acceptable"     1.04
    " agre"           1.00
    " undermin"       0.82
    "soDeliveryDate"  0.81
    " compromises"    0.81
    "lihood"          0.81
    " mosqu"          0.80
    " srfAttach"      0.79
    "ible"            0.78
    "GoldMagikarp"    0.78
    Activation Density: 0.007%

    No Known Activations