INDEX
Explanations
statements that refer to a narrative or a sequence of events
mentions of "story."
Negative Logits
ignt        -0.77
orem        -0.76
emale       -0.73
aez         -0.71
inence      -0.69
ynski       -0.68
anyon       -0.65
ardless     -0.62
ategory     -0.61
numeric     -0.61
Positive Logits
telling     1.33
te          1.13
boards      1.01
book        1.00
tell        0.98
arc         0.97
board       0.94
arcs        0.90
books       0.89
boarding    0.86
Activations Density: 0.086%