    Explanations

    Attends to definitions or explanatory phrases that appear in the context of political events or terminology.

    Head Attribution Weights
    0:0.07
    1:0.09
    2:0.08
    3:0.09
    4:0.08
    5:0.03
    6:0.36
    7:0.16
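
The head weights listed above can be ranked with a short script to see which attention head dominates. This is only a sketch: the variable names are hypothetical, the values are copied from the listing, and the listing does not say whether the weights are meant to sum to 1 (as printed they sum to about 0.96).

```python
# Head attribution weights copied from the listing (head index -> weight).
head_weights = {0: 0.07, 1: 0.09, 2: 0.08, 3: 0.09,
                4: 0.08, 5: 0.03, 6: 0.36, 7: 0.16}

# Sum of the printed weights; not assumed to be exactly 1.
total = sum(head_weights.values())

# Head with the largest weight, and its share of the printed total.
top_head = max(head_weights, key=head_weights.get)
top_share = head_weights[top_head] / total

# Heads sorted from highest to lowest weight.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)

print(f"total weight: {total:.2f}")
print(f"top head: {top_head} (share of total: {top_share:.3f})")
for head, weight in ranked:
    print(f"head {head}: {weight:.2f}")
```

On these numbers, head 6 carries the largest weight and head 7 the second largest, with the remaining six heads each below 0.10.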
    Negative Logits
    -0.34
     saites
    -0.30
     بتاريخ
    -0.30
    )");
    
    -0.30
    RegressionTest
    -0.29
    ;">
    
    -0.28
    urally
    -0.28
    SharedDtor
    -0.28
     مرئيه
    -0.27
    \
    
    -0.27
    Positive Logits
     sumpay
    0.38
    tableFuture
    0.35
     địch
    0.33
     prisonniers
    0.33
     المعيارى
    0.33
     delantera
    0.33
     ước
    0.32
     BrowserModule
    0.32
    ToScroll
    0.32
    étit
    0.31
    Activation Density: 0.064%
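
To get a feel for how sparse this is, the percentage can be converted into an expected count of active tokens. The sketch below assumes that the density is the fraction of sampled tokens on which the feature has a nonzero activation; the listing itself does not define the term, and the function name is hypothetical.

```python
# Density value copied from the listing, in percent.
density_percent = 0.064

# Convert percent to a plain fraction.
density_fraction = density_percent / 100


def expected_active_tokens(n_tokens: int) -> float:
    """Expected number of tokens with nonzero activation, under the
    assumption that density is a per-token firing fraction."""
    return density_fraction * n_tokens


per_million = expected_active_tokens(1_000_000)
print(f"about {per_million:.0f} active tokens per million")
```

Under that assumption, the feature would fire on roughly 640 tokens out of every million.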

    No Known Activations