INDEX
Explanations
names or terms related to specific individuals or characters
repeated references to specific names or titles
Negative Logits
oun                -0.83
inventoryQuantity  -0.73
Flavoring          -0.73
edIn               -0.67
itely              -0.66
acles              -0.66
hips               -0.66
edin               -0.64
anke               -0.64
nesday             -0.63
Positive Logits
trap   0.84
Dum    0.74
BOOK   0.71
Ñĭ     0.71
urat   0.70
Chap   0.69
Book   0.69
ments  0.69
Topic  0.67
aign   0.67
Activations Density: 0.022%