INDEX
Explanations
phrases related to beliefs or perspectives
phrases related to misleading beliefs or misconceptions
Negative Logits
actionGroup      -0.74
phabet           -0.70
ZIP              -0.66
Miscellaneous    -0.66
gore             -0.65
illary           -0.64
NCT              -0.61
collisions       -0.60
NetMessage       -0.60
Jarvis           -0.59
Positive Logits
assumption       1.44
belief           1.41
expectation      1.13
believing        1.10
assumptions      1.10
pessim           1.08
optimism         1.05
beliefs          1.05
impression       1.04
presumption      1.04
Activations Density: 0.972%