INDEX
Explanations
mentions of different genres
references to various genres of literature or media
Negative Logits
Token        Logit
Lumpur       -0.87
urion        -0.87
riel         -0.82
ermanent     -0.72
loo          -0.72
sie          -0.70
administ     -0.68
amen         -0.65
erald        -0.65
bats         -0.64
Positive Logits
Token        Logit
genre        0.80
genres       0.79
fiction      0.79
ologies      0.79
tropes       0.75
genre        0.74
icity        0.73
conventions  0.73
allo         0.72
¥µ           0.70
Activations Density: 0.017%