INDEX
Explanations
phrases related to values and beliefs, particularly focusing on alignment with personal or organizational values
references to personal or collective values
Negative Logits
owl           -0.72
ilant         -0.69
agh           -0.68
Tunnel        -0.66
enary         -0.65
eor           -0.65
ittle         -0.65
dry           -0.64
sight         -0.64
availability  -0.63
Positive Logits
esp           1.05
tenets        0.97
underpin      0.97
beliefs       0.96
ideals        0.96
embodied      0.95
preached      0.94
clash         0.92
indoctr       0.90
guiding       0.87
Activations Density: 0.131%