    Explanations

    phrases and concepts related to moral and ethical dilemmas

    Negative Logits
    Token (quoted to show leading spaces)    Logit
    " متعلقه"    -0.67
    " насељу"    -0.63
    " ویکی‌آمباردا"    -0.63
    "pushFollow"    -0.58
    " Bourgoin"    -0.58
    " dessutom"    -0.57
    " Briefly"    -0.57
    "lorum"    -0.57
    "[++"    -0.56
    " bénévoles"    -0.56
    Positive Logits
    Token (quoted to show leading spaces)    Logit
    " just"    0.79
    "just"    0.67
    "Just"    0.63
    " far"    0.63
    " Just"    0.61
    " merely"    0.60
    "far"    0.58
    "只是一"    0.54
    " далеко"    0.54
    " only"    0.53
    Activation Density: 0.172%

    No Known Activations