    NEGATIVE LOGITS
    Token             Logit
    "\tupdate"        -0.07
    " greeting"       -0.07
    " permanently"    -0.06
    "/github"         -0.06
    "oze"             -0.06
    "bet"             -0.06
    "une"             -0.06
    "//"              -0.06
    " oldukları"      -0.06
    (not shown)       -0.06
    POSITIVE LOGITS
    Token             Logit
    "DataSource"      0.12
    ".DataSource"     0.11
    " DataSource"     0.10
    ".dataSource"     0.09
    " aspir"          0.08
    " datasource"     0.08
    " dataSource"     0.07
    " asp"            0.07
    "isos"            0.07
    "_DISP"           0.06
    Activation density: 0.003%

    No Known Activations