INDEX
Explanations
references to materials, specifically steel
Head Attr Weights
0:0.09
1:0.06
2:0.08
3:0.07
4:0.08
5:0.08
6:0.08
7:0.09
8:0.08
9:0.08
10:0.09
11:0.08
Negative Logits
Reviewer: -1.80
mbuds: -1.80
bisexual: -1.77
��: -1.76
galitarian: -1.74
isexual: -1.74
ilial: -1.72
ilitarian: -1.69
othing: -1.69
conom: -1.68
Positive Logits
leaked: 2.02
pics: 1.81
Moment: 1.74
hars: 1.65
airing: 1.50
quotes: 1.47
pumped: 1.44
documented: 1.42
pic: 1.42
highlighted: 1.41
Activations Density 0.000%