INDEX
Explanations
certain descriptive phrases or qualities within sentences
Negative Logits
ayan        -0.71
»           -0.71
uku         -0.71
ional       -0.70
Zone        -0.67
ione        -0.67
Contract    -0.65
peat        -0.65
vl          -0.65
seys        -0.64
Positive Logits
distinguishes    1.27
inspires         1.20
separates        1.17
motiv            1.15
enables          1.12
defines          1.10
underpin         1.09
attracts         1.09
makes            1.08
drove            1.07
Activations Density: 0.162%