INDEX
    Explanations

    justice, injustices, judiciary

    Negative Logits
    itzer       -0.12
    sWith       -0.09
     culpa      -0.08
    sm          -0.08
     scratch    -0.08
    /stdc       -0.08
    thic        -0.08
    Ñħови       -0.08
    pra         -0.08
    bis         -0.08
    Positive Logits
    ifiable     0.16
    ous         0.14
    ifi         0.14
    icial       0.12
    ices        0.12
    /right      0.12
    ously       0.12
    iciary      0.12
    iros        0.12
     distrib    0.11
    Activation Density 0.021%

    No Known Activations