INDEX
Explanations
quantifiable metrics or statistical references in discourse
Head Attr Weights
0:0.33
1:0.04
2:0.03
3:0.07
4:0.05
5:0.12
6:0.05
7:0.02
8:0.15
9:0.06
10:0.02
11:0.02
Negative Logits
deserts       -1.81
livion        -1.69
Ruin          -1.66
Default       -1.60
ocalypse      -1.58
scams         -1.58
vertising     -1.57
azo           -1.55
adra          -1.52
Interstitial  -1.52
Positive Logits
adherents     2.07
delegates     1.91
gathered      1.90
authority     1.85
electors      1.84
ENTS          1.81
witnesses     1.74
scient        1.73
elf           1.68
武            1.68
Activations Density: 0.001%