INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.08
5:0.09
6:0.08
7:0.08
8:0.08
9:0.07
10:0.09
11:0.07
Negative Logits
Aires     -2.71
ulner     -2.46
bright    -2.43
Soph      -2.33
irens     -2.30
scl       -2.27
hea       -2.25
tmp       -2.23
Bohem     -2.22
nel       -2.19
Positive Logits
Publication     2.63
ipedia          2.51
textual         2.50
encyclopedia    2.46
preferential    2.39
iety            2.37
aggregate       2.32
favorably       2.30
Publications    2.26
Article         2.26
Activations Density 0.000%
This feature has no known activations.