Explanations
references to individuals and their actions or experiences
Head Attribution Weights (head:weight)
0:0.16
1:0.10
2:0.09
3:0.05
4:0.03
5:0.03
6:0.14
7:0.10
8:0.02
9:0.04
10:0.13
11:0.05
Negative Logits
Token   Logit
Frag    -3.31
tank    -2.93
Katy    -2.93
irit    -2.70
mint    -2.65
Frag    -2.64
inert   -2.61
biod    -2.61
tank    -2.54
bab     -2.48
Positive Logits
Token        Logit
Cunningham   8.81
Cary         3.99
Compton      3.41
Chung        3.40
Cox          3.33
Swanson      3.30
Dunn         3.24
Cov          3.13
Durham       3.12
Glover       3.04
Activations Density: 0.002%