INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.06
2:0.08
3:0.09
4:0.09
5:0.07
6:0.10
7:0.07
8:0.08
9:0.08
10:0.08
11:0.08
Negative Logits
Token         Logit
Reviewer      -1.81
dden          -1.71
unaccount     -1.64
yss           -1.59
ographically  -1.57
footing       -1.56
iffe          -1.55
ONE           -1.52
ayers         -1.50
steen         -1.49
Positive Logits
Token    Logit
Cooking  1.85
ilet     1.76
Hosp     1.64
Wars     1.64
Museum   1.53
dragon   1.52
gard     1.49
Chicken  1.49
Fashion  1.48
Roose    1.48
Activations Density: 0.000%
No Known Activations
This feature has no known activations.