INDEX
    Explanations

    color codes in hexadecimal format

    Negative Logits
    Token               Logit
    "fffffff"           -0.17
    "ãĥ¼ãĤº"            -0.16
    "zen"               -0.15
    " hub"              -0.15
    " z"                -0.15
    " humane"           -0.14
    "eref"              -0.14
    " Hub"              -0.14
    " fun"              -0.13
    "ipop"              -0.13
    Positive Logits
    Token               Logit
    "uw"                0.17
    "00"                0.16
    "idd"               0.15
    "660"               0.15
    "batis"             0.14
    "æŁ"                0.14
    "66"                0.14
    "ParameterValue"    0.14
    ".datab"            0.14
    " chatt"            0.14
    Activation Density: 0.014%

    No Known Activations