INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.09
3:0.09
4:0.09
5:0.07
6:0.08
7:0.06
8:0.07
9:0.08
10:0.08
11:0.08
Negative Logits
theorem      -1.61
joke         -1.45
MacArthur    -1.45
Ludwig       -1.42
Bloom        -1.42
Stall        -1.41
             -1.38
Sherman      -1.35
MSM          -1.35
quote        -1.34
Positive Logits
conservancy  2.04
ADRA         1.83
uckland      1.80
tera         1.79
earch        1.78
abwe         1.78
mbuds        1.78
showc        1.78
withd        1.72
confir       1.72
Activations Density 0.000%
No Known Activations
This feature has no known activations.