INDEX
    Explanations
    Negative Logits
    Token           Logit
    " manoe"        -0.08
    " Baltimore"    -0.07
    " bullet"       -0.07
    "103"           -0.07
    " burden"       -0.07
    (whitespace)    -0.07
    " Fault"        -0.07
    (whitespace)    -0.07
    " bait"         -0.07
    " bat"          -0.06
    Positive Logits
    Token           Logit
    " expressed"    0.13
    " expresses"    0.12
    " expressing"   0.12
    " express"      0.12
    " expressive"   0.11
    " Express"      0.11
    " Expression"   0.10
    " expression"   0.09
    " выраж"        0.09
    " expressions"  0.09
    Activation Density: 0.030%

    No Known Activations