INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.09
2:0.08
3:0.08
4:0.08
5:0.08
6:0.08
7:0.07
8:0.08
9:0.05
10:0.09
11:0.07
Negative Logits
orgetown    -1.91
]);         -1.90
Interior    -1.77
oland       -1.76
]).         -1.60
eton        -1.59
enum        -1.59
])          -1.59
Colon       -1.57
]),         -1.55
Positive Logits
playbook    1.68
coordin     1.61
kits        1.61
bat         1.58
kit         1.56
gel         1.54
deed        1.52
request     1.49
repl        1.49
acronym     1.47
Activations Density 0.000%
No Known Activations
This feature has no known activations.