INDEX
    Explanations
    Negative Logits
    Token                      Logit
    "Som"                      -0.07
    "女性"                     -0.07
    "_term"                    -0.07
    " ids"                     -0.07
    " yakın"                   -0.07
    "[var"                     -0.07
    " Zeit"                    -0.07
    " friction"                -0.07
    " haircut"                 -0.07
    " nationalism"             -0.06

    Positive Logits
    Token                      Logit
    " apost"                    0.16
    " Apost"                    0.13
    " Apostle"                  0.09
    " απο"                      0.09
    " Follow"                   0.08
    " disciples"                0.08
    " Apocalypse"               0.08
    (whitespace-only token)     0.07
    "Oct"                       0.07
    "ocator"                    0.07

    Activation Density: 0.002%

    No Known Activations