    Negative Logits
    _pattern    -0.07
    mod         -0.07
     APC        -0.07
    _light      -0.06
    -fr         -0.06
    _fh         -0.06
     casc       -0.06
    感情        -0.06
    ())/        -0.06
     ELEMENT    -0.06
    Positive Logits
    i           0.07
    ai          0.07
     hired      0.07
    resher      0.06
    Coverage    0.06
     schema     0.06
                0.06
     окруж      0.06
     unser      0.06
    syntax      0.06
    Activation Density: 0.001%

    No Known Activations