Reinforced Learning with Human Feedback

[[RLHF]] is short for [[Reinforced Learning with Human Feedback]]. It is a testing or training strategy in which data scientists use human feedback to make sense of, or assign [[social meaning]] to, data. One way to think about [[RLHF]] at scale is to treat all [[human annotation]] and [[data editing history]] as part of the feedback signal, which becomes practical with a web-based interface that captures human input on portable networked devices. Furthermore, if this history is ordered and encoded with the block numbers of a public [[blockchain]], the occurrence of all these [[RLHF]] activities can be integrated globally to reflect the statistical behavior of collective human intent. This is where individual and collective [[accountability]] can be operationalized.
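
The aggregation idea above can be made concrete with a small sketch. The following Python snippet is a minimal illustration, not part of any existing implementation described on this page: it assumes a hypothetical feedback record that pairs each human judgment with its author, the item it annotates, and the block number of a public blockchain used purely as an ordering anchor, then tallies the records into a simple collective signal. All names such as FeedbackRecord and aggregate_feedback are illustrative assumptions.

<syntaxhighlight lang="python">
# Minimal sketch (assumed names, not from this wiki): ordering human feedback
# events by public-blockchain block number and aggregating them into a simple
# collective signal that an RLHF pipeline could consume as reward labels.
from dataclasses import dataclass
from collections import defaultdict
from typing import Dict, List


@dataclass(frozen=True)
class FeedbackRecord:
    editor: str        # who gave the feedback (individual accountability)
    item_id: str       # the annotated data item or wiki revision
    label: int         # +1 approve / -1 reject: the "social meaning" assigned
    block_number: int  # block number of a public blockchain, used as an ordering anchor


def aggregate_feedback(records: List[FeedbackRecord]) -> Dict[str, float]:
    """Order records by block number and compute a per-item mean label,
    i.e. a crude statistical summary of collective human intent."""
    per_item: Dict[str, List[int]] = defaultdict(list)
    for rec in sorted(records, key=lambda r: r.block_number):
        per_item[rec.item_id].append(rec.label)
    return {item: sum(labels) / len(labels) for item, labels in per_item.items()}


if __name__ == "__main__":
    history = [
        FeedbackRecord("alice", "rev-101", +1, block_number=18_000_001),
        FeedbackRecord("bob",   "rev-101", -1, block_number=18_000_007),
        FeedbackRecord("carol", "rev-102", +1, block_number=18_000_010),
    ]
    print(aggregate_feedback(history))  # {'rev-101': 0.0, 'rev-102': 1.0}
</syntaxhighlight>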
    
    
<noinclude>



==References==


==Related Pages==