INDEX
    Explanations

    the word "genre" with high activation values

    references to different genres in media

    Negative Logits
        Token          Value
        "urion"        -0.85
        "riel"         -0.84
        " Lumpur"      -0.80
        "amen"         -0.71
        "ermanent"     -0.69
        "erald"        -0.66
        " Wilhelm"     -0.64
        " Uz"          -0.64
        " Lama"        -0.63
        "administ"     -0.63
    Positive Logits
        Token          Value
        "genre"        0.82
        " genres"      0.81
        " fiction"     0.76
        "ologies"      0.75
        " genre"       0.74
        "allo"         0.73
        "¥µ"           0.71
        " juices"      0.69
        "adelphia"     0.68
        "ĸļ"           0.68
    Activation Density: 0.018%

    No Known Activations