INDEX
Explanations
pronouns and references to people or entities in conversations
Head Attr Weights
0:0.02
1:0.03
2:0.13
3:0.20
4:0.02
5:0.05
6:0.05
7:0.08
8:0.13
9:0.09
10:0.08
11:0.08
Negative Logits
ORPG: -1.34
aez: -1.16
sqor: -1.16
arily: -1.11
clerosis: -1.08
orically: -1.07
emort: -1.05
itionally: -1.03
ificantly: -1.02
Siren: -1.02
Positive Logits
walls: 1.06
bends: 1.05
confines: 1.02
bushes: 0.98
scenes: 0.97
junction: 0.97
Lines: 0.95
Dispatch: 0.95
horns: 0.94
ropes: 0.94
Activations Density: 0.013%