INDEX
Explanations
adjectives or nouns related to foundational principles or root causes
references to underlying causes, issues, or factors in various contexts
Negative Logits
aido     -0.90
helm     -0.79
uba      -0.77
Sabha    -0.76
udo      -0.76
hops     -0.76
urate    -0.75
asia     -0.74
zanne    -0.73
hire     -0.73
Positive Logits
premise        1.35
principle      1.32
principles     1.29
assumption     1.24
rationale      1.24
structure      1.21
theme          1.20
tenets         1.19
underlying     1.18
motivations    1.16
Activations Density 0.082%
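
The density figure above can be read as a simple ratio. The sketch below assumes that "Activations Density" means the fraction of sampled tokens on which the feature has a nonzero activation; the source does not define the term, and the function name, threshold parameter, and sample data are hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 8 of which activate the feature.
sample = [0.0] * 9992 + [1.5] * 8
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.082% would correspond to the feature firing on roughly 8 out of every 10,000 tokens.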