INDEX
Explanations
phrases that indicate a sense of suspicion or intrigue
references to acts of violence or conflict
Negative Logits
Token         Logit
Originally    -0.41
earchers      -0.38
itars         -0.38
puzzled       -0.38
undrum        -0.37
FANTASY       -0.37
urches        -0.36
›             -0.35
Uriel         -0.35
fascinated    -0.34
Positive Logits
Token         Logit
)).           0.85
.).           0.68
]).           0.68
]."           0.66
}.            0.64
%.            0.62
).            0.58
%).           0.58
)."           0.54
$.            0.53
Activations Density: 4.726%