Moral Graph Elicitation
Literature
- https://arxiv.org/abs/2404.10636 - Original paper
- Optional
- https://arxiv.org/pdf/2406.07814 - Collective CAI
Discussion
- What does "value cards" mean? What are moral principles?
- values as choice criteria, choice theory
- Taylor 1977
- Other theorists of choice and agency (Chang, 2004a; Levi, 1990)
- The principle of up-weighting statements that gain broad support across diverse clusters of participants, rather than the statements with the most total votes, is often called bridging (Ovadya and Thorburn, 2023)
- Social choice theory
- market-like and bargaining-based approaches (Howard, 1992)
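
To make the bridging idea from the discussion concrete, here is a minimal sketch. It assumes each participant already carries a cluster label and scores a statement by its lowest approval rate across clusters. That minimum is just one simple way to reward broad support; it is not necessarily the formula used by Ovadya and Thorburn or by the paper. All names (`bridging_score`, `cluster_of`, the example statements) are hypothetical.

```python
from collections import defaultdict


def total_votes(votes):
    """Count approvals, ignoring which cluster they come from."""
    return sum(votes.values())


def bridging_score(votes, cluster_of):
    """Lowest approval rate across clusters (assumed scoring rule, not from the paper).

    votes: dict mapping participant -> True/False (approved or not)
    cluster_of: dict mapping participant -> cluster label
    """
    by_cluster = defaultdict(list)
    for person, approved in votes.items():
        by_cluster[cluster_of[person]].append(approved)
    # A statement only scores well if every cluster approves of it at a decent rate.
    return min(sum(v) / len(v) for v in by_cluster.values())


# Made-up data: cluster A has four participants, cluster B has two.
cluster_of = {"a1": "A", "a2": "A", "a3": "A", "a4": "A", "b1": "B", "b2": "B"}
statements = {
    "popular_in_A": {
        "a1": True, "a2": True, "a3": True, "a4": True, "b1": False, "b2": False,
    },
    "broadly_supported": {
        "a1": True, "a2": True, "a3": False, "a4": False, "b1": True, "b2": False,
    },
}

by_total = max(statements, key=lambda s: total_votes(statements[s]))
by_bridging = max(statements, key=lambda s: bridging_score(statements[s], cluster_of))
```

With this toy data, ranking by total votes picks `popular_in_A` (4 approvals, all from one cluster), while ranking by the bridging score picks `broadly_supported` (half of each cluster approves).
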
- Wisdom

- Moral learning can be understood as a gain in wisdom, a locally-justified transition from one set of values and contexts to another, without referencing an ultimate grounding or universal rule (Taylor, 1995)
- If a target gathered from n people has good values, it should have even better values (relevant to more contexts, with more precision, and wiser) when gathered from n + ϵ people
- Values
- Values are anything that is not merely instrumental (what does it mean for X to be instrumental?)
- A value can be viewed as a three-item tuple (Context, Value, Constitutive Attentional Policies). CAPs are the attentional policies (APs) that are about values, and thus not merely instrumental
- Moral Graph
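
As a rough sketch of the (Context, Value, Constitutive Attentional Policies) tuple described under Values, the structure could be written as a small data class. The class and field names, and the example content, are made up for illustration. The `WiserEdge` type is a further assumption: these notes do not spell out the graph structure, so it simply encodes the "locally-justified transition" from one value to another within a context.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ValueCard:
    """One value, as the three-item tuple from the notes (field names are hypothetical)."""
    context: str
    value: str
    attentional_policies: Tuple[str, ...]  # the CAPs: what one attends to when living by the value


@dataclass(frozen=True)
class WiserEdge:
    """Assumed edge type: in `context`, `to_value` is judged wiser than `from_value`."""
    context: str
    from_value: str
    to_value: str


def wiser_than(edges: List[WiserEdge], context: str, value: str) -> List[str]:
    """Values judged wiser than `value` in `context`, one transition away."""
    return [e.to_value for e in edges if e.context == context and e.from_value == value]


# Illustrative content only; not taken from the paper.
card = ValueCard(
    context="a friend is grieving",
    value="being present",
    attentional_policies=(
        "moments where listening helps more than advice",
        "what the friend says they need right now",
    ),
)
edges = [
    WiserEdge("a friend is grieving", "offering solutions", "being present"),
    WiserEdge("planning a project", "being present", "offering solutions"),
]
result = wiser_than(edges, "a friend is grieving", "offering solutions")
```

Keeping the context on each edge means the same pair of values can be ordered differently in different contexts, which matches the notes' point that moral learning is local rather than grounded in a universal rule.
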

- Moral Graph Elicitation
- Rather than having humans ask participants which moral values are wiser, implement the whole process using LLMs