INDEX
    Explanations

    phrases related to perception and experience

    Head Attr Weights
    0:0.01
    1:0.05
    2:0.14
    3:0.08
    4:0.02
    5:0.10
    6:0.12
    7:0.11
    8:0.10
    9:0.05
    10:0.08
    11:0.08
    Negative Logits
    ivating       -1.12
    ivated        -1.06
    ascus         -1.05
    ription       -1.04
     sidx         -1.03
    added         -0.99
     leaflets     -0.99
    Loading       -0.98
    aucus         -0.97
    asio          -0.96
    Positive Logits
     prope         0.99
    ulhu           0.97
     fireball      0.96
     Deus          0.96
    ',"            0.96
    }.             0.95
                   0.94
    !/             0.93
    $$             0.93
     Doll          0.91
    Act Density 0.113%

    No Known Activations