INDEX
    Explanations

    Chinese philosophy

    Negative Logits
    Token            Logit
    "ipy"            -0.08
    " phiên"         -0.08
    "[]["            -0.07
    ".springboot"    -0.07
    " gn"            -0.07
    "arek"           -0.07
    "[['"            -0.07
    "\tmysql"        -0.07
    "pokemon"        -0.07
    "ాశ"             -0.07
    Positive Logits
    Token            Logit
    " egal"          0.09
    " equality"      0.09
    " fairness"      0.09
    "Equality"       0.08
    " Egal"          0.08
    " Sliding"       0.08
    " Equality"      0.08
    " igualdade"     0.08
    " sliding"       0.08
    " forall"        0.08
    Act Density 0.000%

    No Known Activations