    Negative Logits
    Token           Logit
    ".assertj"      -0.07
    " Butter"       -0.07
    " reluctance"   -0.06
    " stirring"     -0.06
    " remedies"     -0.06
    " rotated"      -0.06
    "leurs"         -0.06
    "encrypt"       -0.06
    "utter"         -0.06
    " let"          -0.06
    Positive Logits
    Token           Logit
    " domain"       0.12
    " Domain"       0.10
    " domains"      0.09
    "domain"        0.09
    ".domain"       0.08
    "ам"            0.08
    " daytime"      0.08
    "-domain"       0.08
    "aim"           0.08
    " Ports"        0.08
    Activation Density: 0.013%

    No Known Activations