INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.10
2:0.07
3:0.11
4:0.08
5:0.08
6:0.06
7:0.06
8:0.06
9:0.10
10:0.08
11:0.08
Negative Logits
(undecodable token): -2.30
Watson: -1.92
Barrett: -1.85
debunked: -1.76
Torch: -1.72
McGu: -1.67
vanquished: -1.67
McAuliffe: -1.60
battled: -1.58
supplemented: -1.56
Positive Logits
illy: 2.00
hur: 1.90
nah: 1.87
inventoryQuantity: 1.84
cookie: 1.81
Shame: 1.76
フォ: 1.75
iche: 1.75
OME: 1.75
stru: 1.74
Activations Density 0.000%
No Known Activations
This feature has no known activations.