INDEX
    Explanations

    terms related to social justice issues and personal accountability

    comparison or similarity

    Negative Logits
    Token              Logit
    " Meksiku"         -0.59
    "dius"             -0.49
    "<bos>"            -0.47
    (token missing)    -0.47
    "cellent"          -0.47
    "mbrie"            -0.45
    (token missing)    -0.44
    "endio"            -0.43
    " cele"            -0.43
    " öz"              -0.41

    Positive Logits
    Token              Logit
    " differently"     1.43
    " así"             1.21
    " näin"            1.12
    " like"            1.11
    " demikian"        1.03
    " assim"           1.00
    " így"             0.99
    " slik"            0.96
    "このように"        0.94
    " böyle"           0.94
    Activation Density 0.500%

    No Known Activations