INDEX
    Explanations

    phrases that refer to interactions within political or religious communities

    Negative Logits
    Token          Logit
    "awns"         -0.15
    "ataires"      -0.14
    "ableView"     -0.14
    "à¹īà¸Ńย"      -0.14
    "ekim"         -0.14
    "è³¢"          -0.14
    "icrous"       -0.13
    ".reducer"     -0.13
    "ãĥ³ãĥĶ"       -0.13
    "unkt"         -0.13
    Positive Logits
    Token            Logit
    " bounds"        0.48
    " confines"      0.43
    " boundaries"    0.43
    " framework"     0.42
    " limits"        0.39
    " walls"         0.38
    "framework"      0.35
    " context"       0.34
    " frameworks"    0.32
    "bounds"         0.32
    Act Density 0.056%
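
    The source does not define the density figure above. A minimal sketch, assuming it denotes the fraction of sampled tokens on which the feature's activation is above zero; the function name, the threshold, and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires. The source does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 56 of which activate the feature.
acts = [0.0] * 99_944 + [1.0] * 56
density = activation_density(acts)
print(f"{density:.3%}")
```

    Under that assumption, a density of 0.056% would correspond to the feature firing on roughly 56 of every 100,000 tokens.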

    No Known Activations