INDEX
    Explanations
    Negative Logits
    χό              -0.07
                    -0.06
     тка            -0.06
    чого            -0.06
    prend           -0.06
     drummer        -0.06
    _world          -0.06
    	head           -0.06
     Mädchen        -0.06
    participant     -0.06
    Positive Logits
     humans         0.08
    This            0.07
     humanity       0.07
    Outline         0.07
    /es             0.07
     orally         0.07
     Automation     0.07
     });↵↵          0.07
    ины             0.07
    acters          0.06
    Activation Density: 0.009%

    No Known Activations