$\begingroup$

We have a large collection of documents (D), each accompanied by a set of metadata (M). Within this collection, some documents act as parents and have multiple child documents. Both parent and child documents belong to D, and the number of children varies from parent to parent. In the past, humans manually sorted the children of every parent at their own discretion, based on the parent and child metadata. Our objective is to develop a machine learning (ML) model that learns this sorting criterion and predicts the sequence of child documents attached to a parent, using both parent and child metadata (M). Essentially, we aim to infer the relative ordering of the child documents associated with a parent.

Currently, we have a dataset structured as M(Parent), M(Children), Sort_Order. However, we can regenerate or rearrange the dataset into whatever format is required. Given this scenario, what strategy should we employ to address this problem?

$\endgroup$
  • $\begingroup$ Cross-posted on Reddit. $\endgroup$ Commented Jun 8, 2023 at 9:27
  • $\begingroup$ @noe yes, got a useful comment there. That led me in the right direction. $\endgroup$ Commented Jun 9, 2023 at 8:22
  • $\begingroup$ Then I suggest you answer your own question here with the correct answer, so that others can also benefit from the knowledge you have gained. $\endgroup$ Commented Jun 9, 2023 at 8:37

1 Answer

$\begingroup$

I got a useful response elsewhere from @divayjindal, which I quote:

This is a sort of learning-to-rank (LTR) problem, where you have a ranked list of items w.r.t. a query. Standard listwise LTR techniques might apply here.
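
To make the mapping concrete: each parent can be treated as the "query" and its children as the items to rank. Below is a minimal sketch in plain Python, not a definitive implementation. It assumes the dataset has been flattened into hypothetical rows of `(parent_id, child_features, sort_order)`, where `child_features` is a numeric vector derived from the metadata and `sort_order` is the human-assigned position (0 = first). It uses a ListNet-style top-one cross-entropy loss with a linear scorer, which is just one listwise option chosen for illustration. The function names, the toy feature values, and the hyperparameters are all made up for the example.

```python
import math
from collections import defaultdict


def group_by_parent(rows):
    """Turn flat (parent_id, child_features, sort_order) rows into per-parent groups."""
    groups = defaultdict(lambda: ([], []))
    for parent_id, features, sort_order in rows:
        groups[parent_id][0].append(features)
        groups[parent_id][1].append(sort_order)
    return list(groups.values())


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def score(weights, features):
    """Linear scoring function: higher score means earlier in the list."""
    return sum(w * x for w, x in zip(weights, features))


def train_listnet(groups, n_features, epochs=1000, lr=0.5):
    """Fit a linear scorer with a ListNet-style top-one cross-entropy loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for feature_rows, positions in groups:
            # Earlier human position -> larger relevance label.
            relevance = [float(len(feature_rows) - p) for p in positions]
            target = softmax(relevance)
            predicted = softmax([score(weights, r) for r in feature_rows])
            # Gradient of the cross-entropy w.r.t. each score is (predicted - target).
            for q, p, row in zip(predicted, target, feature_rows):
                for j, x in enumerate(row):
                    weights[j] -= lr * (q - p) * x
    return weights


def predict_order(weights, feature_rows):
    """Return child indices sorted from first to last according to the model."""
    scores = [score(weights, r) for r in feature_rows]
    return sorted(range(len(feature_rows)), key=lambda i: -scores[i])


# Toy data (hypothetical): humans ordered children by the first feature, descending;
# the second feature is noise.
rows = [
    ("p1", [0.9, 0.2], 0), ("p1", [0.3, 0.7], 2), ("p1", [0.6, 0.1], 1),
    ("p2", [0.2, 0.6], 1), ("p2", [0.5, 0.4], 0),
    ("p3", [0.4, 0.5], 2), ("p3", [1.0, 0.3], 0),
    ("p3", [0.7, 0.9], 1), ("p3", [0.1, 0.2], 3),
]

weights = train_listnet(group_by_parent(rows), n_features=2)

# Children of a new, unseen parent.
new_children = [[0.4, 0.9], [0.95, 0.0], [0.6, 0.5]]
predicted = predict_order(weights, new_children)
print(predicted)
```

One design note: with a scorer like this, a feature that depends only on the parent adds the same constant to every child in the group and cancels out in the softmax, so parent metadata only influences the ordering through parent-child interaction features (for example, products or differences of parent and child fields). In practice a gradient-boosted or neural ranker trained with a listwise or pairwise objective would likely replace the linear scorer, but the grouping of children by parent stays the same.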

$\endgroup$
