    Negative Logits
    Token               Logit
    " administration"   -0.07
    " royal"            -0.07
    " riding"           -0.07
    " Daddy"            -0.07
    " Roll"             -0.07
    " ruler"            -0.07
    "ην"                -0.07
    " readily"          -0.06
    " rapid"            -0.06
    "667"               -0.06
    Positive Logits
    Token               Logit
    " exposed"          0.10
    " Exposure"         0.10
    " exposure"         0.10
    " expose"           0.08
    " exposing"         0.08
    " پوست"             0.07
    (blank)             0.07
    " froze"            0.07
    " expo"             0.07
    " Eyes"             0.07
    Activation Density: 0.015%

    No Known Activations