    Negative Logits
    Token            Logit
    " Snyder"        -0.07
    "itch"           -0.07
    ".btnDelete"     -0.07
    "Chicago"        -0.07
    "ritz"           -0.07
    " Selling"       -0.06
    "_iteration"     -0.06
    " Detective"     -0.06
    "geh"            -0.06
    "subj"           -0.06
    Positive Logits
    Token            Logit
    " warm"          0.22
    " Warm"          0.18
    "Warm"           0.15
    "warm"           0.14
    " warmer"        0.13
    " warmed"        0.12
    " warmth"        0.11
    " warming"       0.11
    " warmly"        0.10
    ".Raw"           0.08
    Activation Density: 0.010%

    No Known Activations