    Negative Logits
    Token          Logit
    " prepare"     -0.07
    "configure"    -0.07
    "aney"         -0.07
    "Area"         -0.06
    " Auditor"     -0.06
    "unday"        -0.06
    " princess"    -0.06
    " Михай"       -0.06
    " مور"         -0.06
    " organise"    -0.06
    Positive Logits
    Token                        Logit
    " Self"                      0.19
    " self"                      0.17
    "Self"                       0.14
    " SELF"                      0.14
    "\tself"                     0.11
    "self"                       0.10
    "/self"                      0.10
    ":self"                      0.09
    (token missing in source)    0.09
    " selfies"                   0.09
    Activation Density: 0.020%

    No Known Activations