INDEX
Explanations
words related to actions, decisions, and consequences
phrases indicating the existence of facts or ongoing states
Negative Logits
Manifest      -0.70
Communities   -0.68
roid          -0.66
styles        -0.65
Capital       -0.64
transforms    -0.63
Unic          -0.63
Places        -0.62
Economy       -0.61
Citiz         -0.61
Positive Logits
now                  1.09
now                  0.91
NOW                  0.80
externalActionCode   0.79
hindsight            0.79
currently            0.76
psey                 0.76
raf                  0.76
bol                  0.74
bn                   0.72
Activations Density: 1.113%