INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.06
1:0.07
2:0.08
3:0.09
4:0.08
5:0.07
6:0.09
7:0.07
8:0.08
9:0.07
10:0.10
11:0.07
Negative Logits
Father      -1.88
��極         -1.75
Third       -1.75
UGC         -1.74
GOP         -1.69
DOWN        -1.69
Asset       -1.66
ELF         -1.66
Dad         -1.65
Including   -1.59
Positive Logits
]]          1.70
"""         1.55
anwhile     1.51
idth        1.50
ucker       1.47
fing        1.47
etics       1.46
"/>         1.45
'."         1.44
nap         1.43
Activations Density 0.000%
No Known Activations
This feature has no known activations.