INDEX
Explanations
the word "fiction" or phrases related to fictional content
instances of the word "fiction" in various contexts
Negative Logits
xon              -0.74
hens             -0.71
hap              -0.67
baugh            -0.66
realDonaldTrump  -0.66
arov             -0.64
bred             -0.64
gars             -0.63
umm              -0.62
Parenthood       -0.61
Positive Logits
fiction          1.00
anthology        0.93
Fiction          0.87
fiction          0.83
novels           0.82
Writers          0.82
novelist         0.81
writers          0.78
writer           0.78
istically        0.77
Activations Density: 0.022%