Explanations
phrases that involve multiple facets of issues or topics, particularly those that highlight complexity or connections among different elements
Head Attr Weights
0:0.10
1:0.01
2:0.17
3:0.04
4:0.05
5:0.03
6:0.16
7:0.01
8:0.19
9:0.04
10:0.05
11:0.10
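
The weights listed above can be ranked to see which heads dominate. The following is a minimal sketch, assuming each `head:weight` line gives the attribution weight of one attention head; the names `head_weights` and `top_heads` are hypothetical and not part of the dashboard.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.10, 1: 0.01, 2: 0.17, 3: 0.04, 4: 0.05, 5: 0.03,
    6: 0.16, 7: 0.01, 8: 0.19, 9: 0.04, 10: 0.05, 11: 0.10,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights, descending."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Total of the listed weights; the displayed values are rounded to two decimals.
total = sum(head_weights.values())

if __name__ == "__main__":
    for head, weight in top_heads(head_weights):
        print(f"head {head}: {weight:.2f} ({weight / total:.0%} of listed total)")
```

On the values listed, heads 8, 2, and 6 rank highest, and the weights sum to about 0.95, presumably because of rounding in the display.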
Negative Logits
ensibly: -1.73
Piercing: -1.73
entially: -1.72
040: -1.70
901: -1.69
ェ: -1.67
0010: -1.65
ayne: -1.65
030: -1.64
itely: -1.63
Positive Logits
etc: 1.97
etc: 1.64
ospace: 1.58
BlackBerry: 1.54
relationships: 1.54
Tel: 1.52
CTR: 1.52
constitu: 1.52
opinion: 1.50
Mehran: 1.50
Activations Density 0.014%