INDEX
    Explanations

    movie/acting reviews

    NEGATIVE LOGITS
     weeds      -0.07
    .ind        -0.07
     З          -0.06
     CCS        -0.06
     $↵         -0.06
    (blank)     -0.06
     Naw        -0.06
    _;↵↵        -0.06
     <↵         -0.06
    ertest      -0.06
    POSITIVE LOGITS
     spun       0.07
    (blank)     0.07
    (blank)     0.07
    accent      0.06
     réuss      0.06
    毛主席         0.06
    entai       0.06
     karakter   0.06
    (blank)     0.06
    (blank)     0.06
    Act Density 0.044%

    No Known Activations