INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.09
3:0.07
4:0.10
5:0.06
6:0.10
7:0.07
8:0.08
9:0.07
10:0.08
11:0.08
Negative Logits
itans          -1.68
odium          -1.58
lam            -1.57
amphetamine    -1.52
Californ       -1.51
hetamine       -1.46
Retrieved      -1.45
veyard         -1.43
encia          -1.40
amia           -1.40
Positive Logits
Agent          2.11
neighb         2.08
invis          1.81
horizont       1.74
Roose          1.66
appropriately  1.61
perpend        1.59
Together       1.57
robe           1.47
enthusi        1.46
Activations Density 0.000%
No Known Activations
This feature has no known activations.