INDEX
    Explanations

    references to controversial figures and their actions or statements

    Negative Logits
      
    -1.31
       
    -1.05
          
    -1.00
           
    -0.99
     utilising
    -0.99
         
    -0.95
     utilised
    -0.95
    -0.92
    ––
    -0.91
     utilise
    -0.90
    Positive Logits
     XNUMX
    1.50
     ​​
    1.36
    ̵
    1.32
    NUMX
    1.29
    🇧
    0.97
    0.93
     .;
    0.91
     և
    0.90
     ».
    0.90
     .:
    0.89
    Act Density 0.054%

    No Known Activations