INDEX
    Explanations

    phrases and concepts associated with political structures and actions

    Negative Logits
     similarly
    -0.41
     similar
    -0.35
    similar
    -0.34
    imilar
    -0.34
     podob
    -0.33
     comparable
    -0.31
     Similar
    -0.30
    Similar
    -0.29
     simil
    -0.28
     подоб
    -0.27
    Positive Logits
     same
    0.49
    same
    0.49
    Same
    0.48
     Same
    0.47
     mismo
    0.39
     SAME
    0.38
    同
    0.34
    _same
    0.33
     misma
    0.33
     mesma
    0.32
    Act Density 0.138%

    No Known Activations