INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.08
3:0.10
4:0.09
5:0.07
6:0.09
7:0.09
8:0.07
9:0.08
10:0.08
11:0.07
Negative Logits
nda:-2.09
ovo:-1.86
dot:-1.79
liv:-1.76
contin:-1.75
zb:-1.72
Sched:-1.71
ubs:-1.71
rone:-1.70
ni:-1.69
Positive Logits
educate:1.92
immersed:1.86
architect:1.73
educated:1.71
philanthrop:1.67
devote:1.67
firsthand:1.65
careful:1.63
leap:1.63
educating:1.63
Activations Density 0.000%
No Known Activations