INDEX
Explanations
names of individuals and entities mentioned in the text
Head Attr Weights
0:0.09
1:0.04
2:0.08
3:0.16
4:0.14
5:0.04
6:0.02
7:0.05
8:0.04
9:0.03
10:0.22
11:0.03
Negative Logits
imum         -2.29
beginner     -2.26
phthal       -2.18
isite        -2.16
isable       -2.15
remote       -2.15
species      -2.14
isoft        -2.12
existent     -2.09
functional   -2.09
Positive Logits
disagreed      3.04
apologized     2.96
testified      2.84
VIDEOS         2.82
spoke          2.81
cheered        2.71
UNCLASSIFIED   2.70
disagrees      2.69
refuted        2.69
objected       2.68
Activations Density 0.427%
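
The density figure above is usually read as the fraction of token positions on which the feature fires. The sketch below shows one way such a number could be computed. It assumes that "fires" means an activation strictly greater than zero, which this page does not confirm, and both the function name `activation_density` and the sample activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of them active.
acts = [0.0] * 996 + [1.2, 0.4, 3.1, 0.7]
density = activation_density(acts)
print(f"Activation density: {density:.3%}")
```

A density well under 1%, like the one reported here, would mean the feature is sparse and fires on only a small share of tokens.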