INDEX
    Explanations

    words related to colors

    Negative Logits
    "plug"       -0.81
    "visor"      -0.75
    "wered"      -0.70
    "liest"      -0.70
    "WARE"       -0.70
    " nomine"    -0.68
    "ELF"        -0.68
    "EStream"    -0.68
    "Boot"       -0.65
    " compr"     -0.65
    Positive Logits
    "iseum"      1.02
    "ossus"      0.95
    "estial"     0.84
    "onel"       0.82
    "isions"     0.81
    "s"          0.79
    "ophon"      0.77
    "icol"       0.75
    "umn"        0.75
    "isco"       0.74
    Act Density 0.025%

    No Known Activations