INDEX
    Negative Logits
    " harb"         -0.08
    " recal"        -0.07
    " ejected"      -0.07
    "	load"        -0.07
    " meilleure"    -0.06
    " bur"          -0.06
    " jente"        -0.06
    " phép"         -0.06
    " гум"          -0.06
    "…but"          -0.06

    Positive Logits
    " intensive"     0.10
    "-intensive"     0.08
    " Puppy"         0.06
    "Ph"             0.06
    "mination"       0.06
    "Conv"           0.06
    "Php"            0.06
    "TypeInfo"       0.06
    " volunteer"     0.06
    " lsp"           0.06

    Activation Density: 0.002%

    No Known Activations