INDEX
    Explanations

    issues and problems related to inequality and disadvantage in various contexts

    Negative Logits
    Token              Logit
    " addCriterion"    -0.19
    " Dit"             -0.15
    "onica"            -0.15
    "æ³ģ"              -0.14
    "idis"             -0.14
    "abbr"             -0.14
    "åĪ"               -0.14
    "ept"              -0.14
    " entail"          -0.14
    "δί"               -0.13
    Positive Logits
    Token              Logit
    " may"             0.20
    " can"             0.20
    " explains"        0.17
    " naturally"       0.17
    " was"             0.17
    " is"              0.17
    " might"           0.17
    " isn"             0.16
    " could"           0.16
    " sometimes"       0.16
    Act Density 0.183%

    No Known Activations