INDEX
Explanations
examples of imaginative or fictional elements in narratives
Negative Logits
-
-0.33
---
-0.29
âĢIJ
-0.27
-↵
-0.27
---↵
-0.26
-↵↵
-0.25
—
-0.25
ãĢľ
-0.25
âĪĴ
-0.24
—↵
-0.24
Positive Logits
–
0.64
–↵↵
0.48
–and
0.45
Âĸ
0.38
.–
0.35
->
0.31
--
0.29
--[
0.25
ÙĢ
0.24
ÙĢÙĦ
0.23
Activations Density 0.059%