INDEX
Explanations
content related to references or citations
Head Attr Weights
0:0.08
1:0.09
2:0.07
3:0.08
4:0.07
5:0.06
6:0.08
7:0.06
8:0.08
9:0.09
10:0.09
11:0.08
Negative Logits
Seb           -3.76
Cous          -3.17
Masquerade    -3.11
Zeit          -3.01
Cheong        -2.92
Darling       -2.81
Spoon         -2.81
acist         -2.78
Mens          -2.78
Ming          -2.77
Positive Logits
generator     3.02
generators    2.92
module        2.71
warr          2.71
reactor       2.62
ugi           2.59
module        2.56
blocker       2.55
player        2.51
URI           2.50
Activations Density 0.000%