INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.08
2:0.08
3:0.08
4:0.08
5:0.08
6:0.08
7:0.07
8:0.07
9:0.07
10:0.08
11:0.08
Negative Logits
foundations   -2.95
constitution  -2.70
bedrock       -2.62
license       -2.56
itutional     -2.50
Haram         -2.47
)].           -2.47
Christian     -2.44
Saban         -2.42
atism         -2.42
Positive Logits
Slay    2.91
ulla    2.81
Farn    2.78
atell   2.68
Ll      2.67
Lyn     2.66
McL     2.64
Jess    2.63
wives   2.63
ryn     2.61
Activations Density 0.000%
No Known Activations
This feature has no known activations.