INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.06
    2:0.08
    3:0.07
    4:0.09
    5:0.07
    6:0.09
    7:0.08
    8:0.08
    9:0.08
    10:0.07
    11:0.08
    Negative Logits
    pez:-1.91
    eger:-1.71
    capt:-1.62
    found:-1.53
    rys:-1.52
    clip:-1.51
    gotten:-1.50
    ppings:-1.48
    burning:-1.45
     Ort:-1.44
    Positive Logits
    Dialogue:1.81
    )=(:1.80
    zona:1.69
    akura:1.68
     conformity:1.68
     Decision:1.66
    Relations:1.62
     Flavoring:1.60
     behavi:1.59
     Commentary:1.59
    Act Density 0.000%

    No Known Activations