    Negative Logits
    (token not captured)  -0.08
    "ادة"  -0.08
    "éral"  -0.08
    "flammatory"  -0.08
    " cloak"  -0.08
    "streeks"  -0.07
    " hidden"  -0.07
    "áles"  -0.07
    " сипат"  -0.07
    " vart"  -0.07
    Positive Logits
    " Intelli"  0.08
    " microbes"  0.08
    "chappen"  0.08
    " Indy"  0.08
    " onboarding"  0.08
    "Merge"  0.08
    "-writing"  0.08
    " provisioning"  0.08
    " proofreading"  0.08
    " inoc"  0.08
    Activation Density: 0.001%

    No Known Activations