INDEX
Explanations
references to specific characters or elements related to a narrative context
Negative Logits
UTTON        -0.17
rung         -0.16
actionTypes  -0.16
IFF          -0.15
ffen         -0.15
ActionTypes  -0.15
ubu          -0.14
vou          -0.14
Ellison      -0.14
bek          -0.14
Positive Logits
491     0.18
uhl     0.16
dale    0.16
leta    0.14
ulls    0.14
plate   0.14
corpor  0.14
opa     0.13
termin  0.13
**/↵↵   0.13
Activations Density 0.003%