INDEX
    Explanations

    news; political events

    Negative Logits
    " residual"   -0.08
    " wf"         -0.07
    "τες"         -0.07
    " seguimos"   -0.07
    " ir"         -0.07
    "єм"          -0.07
    "ווא"         -0.07
    " dividing"   -0.07
    "нок"         -0.07
    " often"      -0.07
    Positive Logits
    " {↵/"        0.09
    "();↵/"       0.08
    " logging"    0.08
    "logging"     0.08
    " /*↵"        0.08
    "Logging"     0.08
    "ifecycle"    0.08
    " {↵"         0.08
    "_logging"    0.08
    " });↵//"     0.08
    Act Density 0.147%

    No Known Activations