INDEX
Explanations
numerical values or references throughout the text
Head Attr Weights
0:0.02
1:0.01
2:0.10
3:0.05
4:0.05
5:0.04
6:0.18
7:0.19
8:0.04
9:0.04
10:0.14
11:0.08
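
As a quick way to read the list above, the following minimal sketch ranks the heads by weight and sums the listed values. It assumes each entry is a `head index:weight` pair; the names `head_attr_weights` and `top_heads` are hypothetical and used only for illustration.

```python
# Head attribution weights copied from the list above (head index -> weight).
# Assumption: each "i:w" entry pairs a head index with its weight.
head_attr_weights = {
    0: 0.02, 1: 0.01, 2: 0.10, 3: 0.05, 4: 0.05, 5: 0.04,
    6: 0.18, 7: 0.19, 8: 0.04, 9: 0.04, 10: 0.14, 11: 0.08,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# The listed values are rounded to two decimals, so the total need not be exactly 1.
total = sum(head_attr_weights.values())

for head, weight in top_heads(head_attr_weights):
    print(f"head {head}: {weight:.2f}")
print(f"sum of listed weights: {total:.2f}")
```

With the values as listed, heads 7, 6, and 10 carry the largest weights.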
Negative Logits
anat:-1.46
trademarks:-1.45
apost:-1.41
hackers:-1.39
ustainable:-1.39
fax:-1.37
clutch:-1.37
gel:-1.36
welcome:-1.34
holster:-1.34
Positive Logits
Balanced:1.63
Discovery:1.49
totality:1.47
ngth:1.42
abba:1.40
Layout:1.38
specified:1.37
aukee:1.37
Survivor:1.37
Conditions:1.36
Activations Density 0.021%