INDEX
Explanations
words and phrases associated with names or identities
Head Attr Weights
0:0.03
1:0.04
2:0.08
3:0.28
4:0.03
5:0.02
6:0.13
7:0.10
8:0.05
9:0.07
10:0.06
11:0.05
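
The head attribution weights above can be ranked to see which heads dominate. This is a minimal sketch, assuming each `index:weight` pair is one attention head's attribution weight for this feature. The names `head_weights` and `ranked` are illustrative and do not come from the dashboard.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.04, 2: 0.08, 3: 0.28,
    4: 0.03, 5: 0.02, 6: 0.13, 7: 0.10,
    8: 0.05, 9: 0.07, 10: 0.06, 11: 0.05,
}

# Sort heads from largest to smallest weight.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]

# Sum of the weights as displayed (they are rounded to two decimals).
total = sum(head_weights.values())

print(f"top head: {top_head} (weight {top_weight:.2f})")
print("top three heads:", [head for head, _ in ranked[:3]])
print(f"sum of listed weights: {total:.2f}")
```

Under this reading, head 3 carries the largest weight (0.28), followed by heads 6 and 7. The listed values sum to about 0.94; whether they are meant to be normalized is not stated here.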
Negative Logits
ALLY            -1.22
RELEASE         -1.21
contrace        -1.18
ATTLE           -1.17
multiplication  -1.15
Failure         -1.14
SUPPORT         -1.14
prelim          -1.13
ドラゴン        -1.12
CLASSIFIED      -1.12
Positive Logits
iceps           1.23
Gustav          1.22
ukong           1.21
uid             1.20
Blumenthal      1.14
én              1.14
unal            1.13
rer             1.12
rez             1.11
Bran            1.11
Activations Density 0.006%