INDEX
    Explanations

    mentions of specific individuals, particularly focusing on their actions or roles

    Head Attr Weights
    0:0.10
    1:0.03
    2:0.07
    3:0.03
    4:0.06
    5:0.05
    6:0.22
    7:0.05
    8:0.07
    9:0.20
    10:0.02
    11:0.04
    Negative Logits
    Token          Logit
    " sofa"        -4.03
    " fid"         -3.82
    "isconsin"     -3.76
    " ALEC"        -3.65
    " Dia"         -3.49
    " couch"       -3.46
    " Zelda"       -3.44
    " bowling"     -3.42
    " Ö"           -3.42
    " contrace"    -3.40
    Positive Logits
    Token          Logit
    " Hopkins"     10.48
    "Hop"           8.56
    " Johns"        6.19
    " Hop"          5.91
    " Patterson"    4.27
    " Rogers"       4.23
    " Pett"         4.12
    " McGregor"     4.09
    " Jenkins"      4.03
    " Hendricks"    3.96
    Activation Density: 0.003%

    No Known Activations