INDEX
Explanations
connections and interactions between entities or individuals
Head Attr Weights
0:0.01
1:0.01
2:0.07
3:0.08
4:0.16
5:0.02
6:0.03
7:0.35
8:0.03
9:0.03
10:0.08
11:0.08
Negative Logits
proportion    -1.72
iameter       -1.52
measure       -1.47
diameter      -1.46
strings       -1.46
quant         -1.43
emphasis      -1.43
stemmed       -1.41
currency      -1.41
tide          -1.40
Positive Logits
prostitutes   1.65
Agent         1.63
Slaughter     1.53
irlf          1.52
ournal        1.42
Sleeping      1.42
fellow        1.42
befriend      1.40
sophistic     1.39
Enemy         1.38
Activations Density: 0.001%