INDEX
    Explanations
    Negative Logits
                      -0.07
    Cold              -0.06
    783               -0.06
    _prot             -0.06
     writing          -0.06
    ong               -0.06
     imprint          -0.06
    Threads           -0.06
    (filePath         -0.06
     extensive        -0.06
    Positive Logits
     soccer           0.18
     Soccer           0.17
    occer             0.10
     vocab            0.07
    _soc              0.07
    soc               0.07
     kosher           0.07
     Rice             0.07
     "|               0.06
    -registration     0.06
    Act Density: 0.002%

    No Known Activations