INDEX
    Explanations
    Negative Logits
    Token          Logit
    "flater"       -0.07
    "大学"         -0.06
    "Char"         -0.06
    " tasty"       -0.06
    (not shown)    -0.06
    " layui"       -0.06
    "-Se"          -0.06
    "्ञ"           -0.06
    (not shown)    -0.06
    " cloth"       -0.06
    Positive Logits
    Token           Logit
    " brightest"    0.07
    " Enter"        0.07
    "PRI"           0.07
    " erfahren"     0.06
    " biblical"     0.06
    " Identify"     0.06
    "]',"           0.06
    (not shown)     0.06
    " Testing"      0.06
    "rei"           0.06
    Act Density 0.005%

    No Known Activations