INDEX
    Explanations
    Negative Logits
    Token             Logit
    " anthrop"        -0.07
    " огром"          -0.07
    " alf"            -0.07
    "Bet"             -0.07
    "Anth"            -0.07
    " bead"           -0.07
                      -0.07
    " protocolo"      -0.07
    " onboard"        -0.07
    "Grab"            -0.07
    Positive Logits
    Token             Logit
    "-ci"             0.08
    "-separated"      0.08
    " Universities"   0.08
                      0.07
    ".Course"         0.07
                      0.07
    " Input"          0.07
    "-là"             0.07
    " Savior"         0.07
    " Colleges"       0.07
    Activation Density: 0.005%

    No Known Activations