    Negative Logits
     Angela         -0.06
    ...">↵          -0.06
     aged           -0.06
     satellite      -0.06
     Typically      -0.05
     Camera         -0.05
    rug             -0.05
    (m              -0.05
     Ř              -0.05
     Race           -0.05
    Positive Logits
     foundation     0.14
     foundations    0.12
     Foundations    0.10
     Foundation     0.09
    FOUNDATION      0.08
     base           0.08
     footh          0.08
     основ          0.07
                    0.07
    .foundation     0.07
    Activation Density: 0.013%

    No Known Activations