INDEX
    Explanations

    words related to temperature or emotions

    expressions of warmth or warmth-related concepts

    Negative Logits
    Token        Logit
    "argon"      -0.75
    " IMAGES"    -0.74
    "FIL"        -0.72
    "tumblr"     -0.68
    "RECT"       -0.67
    "dom"        -0.67
    "sections"   -0.67
    "issors"     -0.66
    "doms"       -0.66
    "Os"         -0.65
    Positive Logits
    Token        Logit
    " warm"      1.34
    "warm"       1.24
    "achine"     1.23
    " warmer"    1.11
    " warmth"    1.07
    " warming"   1.02
    " Warm"      1.01
    " warmed"    0.97
    " fuzz"      0.96
    "est"        0.89
    Activation Density: 0.012%

    No Known Activations