INDEX
    Explanations
    NEGATIVE LOGITS
    -other        -0.08
    рон           -0.08
    aught         -0.08
    anj           -0.07
    individual    -0.07
    other         -0.07
    technical     -0.07
    awaii         -0.07
    Angel         -0.07
    iking         -0.07
    POSITIVE LOGITS
     betr         0.09
     Coins        0.08
     inici        0.08
     asum         0.08
     Initi        0.08
     Subsid       0.08
     Assuming     0.07
     Van          0.07
    PACT          0.07
     Stap         0.07
    Activation Density: 0.016%

    No Known Activations