INDEX
    Explanations
    Negative Logits
    Token         Logit
    "computed"    -0.07
    " lib"        -0.06
    "-webpack"    -0.06
    " Museum"     -0.06
    "Bruce"       -0.06
    "webpack"     -0.06
    "histoire"    -0.06
    "χ"           -0.06
    "£"           -0.06
    " Hansen"     -0.06
    Positive Logits
    Token               Logit
    " in"               0.08
    "_POL"              0.07
    " Fors"             0.07
    "oxic"              0.07
    (no token shown)    0.07
    " In"               0.06
    "ọc"                0.06
    "etros"             0.06
    "elfast"            0.06
    "\tcd"              0.06
    Activation Density: 0.038%

    No Known Activations