INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.09
4:0.08
5:0.07
6:0.08
7:0.09
8:0.07
9:0.09
10:0.07
11:0.09
Negative Logits
idth: -2.18
appropriately: -1.96
vertisement: -1.93
irs: -1.93
srf: -1.90
spot: -1.87
arching: -1.77
mercial: -1.72
ngth: -1.72
cav: -1.70
Positive Logits
Blackwell: 1.82
Hir: 1.56
Mond: 1.54
Carth: 1.53
precinct: 1.52
histor: 1.51
Mot: 1.49
Athen: 1.44
folklore: 1.44
GC: 1.41
Activations Density 0.000%
No Known Activations
This feature has no known activations.