INDEX
    Negative Logits
     Looks        -0.07
     broke        -0.06
     looks        -0.06
    oref          -0.06
     hoe          -0.06
     lacks        -0.06
     opened       -0.06
     folding      -0.06
     Boyd         -0.06
                  -0.06
    Positive Logits
     detergent    0.09
     науки        0.08
    $link         0.07
    _thread       0.07
     PartialEq    0.07
     dedi         0.07
     girdi        0.06
    textAlign     0.06
    ((__          0.06
    =__           0.06
    Activation Density: 0.009%

    No Known Activations