    Explanations

    references to data organization and episode structure

    Head Attr Weights
    0:0.04
    1:0.02
    2:0.04
    3:0.08
    4:0.08
    5:0.08
    6:0.03
    7:0.36
    8:0.06
    9:0.02
    10:0.06
    11:0.06
    Negative Logits
    witz      -2.90
    livious   -2.69
    oly       -2.49
    thouse    -2.45
    aiman     -2.38
    abama     -2.35
    vez       -2.32
    STATE     -2.31
    adia      -2.30
    truth     -2.27
    Positive Logits
     ])             2.25
     chronological  2.25
     numbered       2.22
     Nun            2.18
     Orient         2.16
     Era            2.12
     ply            2.12
     Organization   2.11
     Nas            2.10
     Played         2.10
    Act Density 0.001%

    No Known Activations