INDEX
    Explanations

    national sovereignty and its violation

    Negative Logits
    Token        Value
    ه            0.73
    ség          0.66
    うま         0.66
     рівня       0.64
    ה            0.63
    larda        0.61
    d            0.61
    lige         0.61
    (blank)      0.61
    lendi        0.60
    Positive Logits
    Token        Value
     sovereignty 0.96
     Sovere      0.65
    జేపీ          0.61
    '            0.61
    '.           0.59
    ست           0.58
    ';           0.57
    (blank)      0.57
    (blank)      0.57
    '],          0.57
    Act Density: 0.001%

    No Known Activations