INDEX
Explanations
references to specific characters and their traits in text
Negative Logits
ego              -0.17
etary            -0.17
egin             -0.17
.sharedInstance  -0.17
erton            -0.17
ary              -0.16
enberg           -0.16
enas             -0.15
air              -0.15
ery              -0.15
Positive Logits
istically        0.25
isation          0.20
ised             0.19
izations         0.18
pent             0.18
med              0.17
untime           0.17
oque             0.16
nels             0.16
ually            0.16
Activations Density 0.049%