INDEX
    Explanations

    words related to colors

    Negative Logits
    doms       -0.86
    uthor      -0.82
    _-         -0.81
    olicy      -0.72
    cffffcc    -0.66
     RELE      -0.65
    ernel      -0.63
    llah       -0.63
    =-=-       -0.61
    OTAL       -0.60
    Positive Logits
    blind         1.37
     palette      1.19
     scheme       0.98
    ="#           0.97
    ation         0.94
     coded        0.93
    imeter        0.93
    way           0.91
    =#            0.91
     blindness    0.91
    Activation Density 0.064%

    No Known Activations