INDEX
Explanations
numbers followed by specific keywords or phrases
numerical values or quantities
Negative Logits
ument         -0.75
gart          -0.67
pastoral      -0.65
relat         -0.64
coat          -0.62
assetsadobe   -0.60
plom          -0.59
savior        -0.59
yield         -0.59
dest          -0.59
Positive Logits
ecause        1.07
Wonders       1.04
eenth         0.92
ioned         0.91
th            0.89
DAY           0.89
07            0.89
883           0.88
69            0.87
88            0.86
Activations Density: 0.084%
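
A density figure like this is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below illustrates that reading only; the page does not define the metric, so the function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    where the feature is active (activation > threshold).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 84 of them active.
sample = [0.0] * 99_916 + [1.5] * 84
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.084% means the feature is active on roughly 84 of every 100,000 tokens, so it is a sparse feature.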