Auto

labridge.func_modules.paper.parse.parsers.auto

labridge.func_modules.paper.parse.parsers.auto.auto_parse_paper(file_path, source_analyzer, use_llm_for_source)

Automatically parse a paper according to the analyzed paper source.

PARAMETER DESCRIPTION
file_path

The paper path.

TYPE: Union[str, Path]

source_analyzer

The analyzer that determines the paper source.

TYPE: PaperSourceAnalyzer

use_llm_for_source

Whether source_analyzer should use an LLM when analyzing the source.

TYPE: bool

RETURNS DESCRIPTION
List[Document]

The parsed paper documents. For example, a paper from Nature is separated into these components: ABSTRACT, MAINTEXT, REFERENCES, METHODS.

Source code in labridge\func_modules\paper\parse\parsers\auto.py
def auto_parse_paper(
	file_path: Union[str, Path],
	source_analyzer: PaperSourceAnalyzer,
	use_llm_for_source: bool,
) -> List[Document]:
	r"""
	Automatically parse a paper according to the analyzed paper source.

	Args:
		file_path (Union[str, Path]): The paper path.
		source_analyzer (PaperSourceAnalyzer): The analyzer that determines the paper source.
		use_llm_for_source (bool): Whether source_analyzer should use an LLM when analyzing the source.

	Returns:
		List[Document]: The parsed paper documents.
			For example, a paper from Nature is separated into these components:
			`ABSTRACT`, `MAINTEXT`, `REFERENCES`, `METHODS`.
	"""
	paper_source = source_analyzer.analyze_source(file_path, use_llm_for_source)

	if paper_source == PaperSource.NATURE:
		parser = NaturePaperParser()
	elif paper_source == PaperSource.IEEE:
		parser = IEEEPaperParser()
	elif paper_source == PaperSource.DEFAULT:
		parser = DefaultPaperParser()
	else:
		raise ValueError("Invalid paper source.")

	docs = parser.parse_paper(file_path=file_path)
	return docs
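
The source-based dispatch above can be tried in isolation with stand-in classes. The following is a minimal sketch, not labridge code: `StubAnalyzer`, the `*StubParser` classes, the `PARSERS` table, and the enum values are all hypothetical, and the stub parsers return plain strings rather than `Document` objects. It also swaps the `if`/`elif` chain for a dictionary lookup, which is one possible way to keep the mapping in a single place.

```python
from enum import Enum
from pathlib import Path
from typing import Dict, List, Type, Union


class PaperSource(Enum):
    # Stand-in for labridge's PaperSource; the member values are assumptions.
    NATURE = "Nature"
    IEEE = "IEEE"
    DEFAULT = "Default"


class StubParser:
    # Hypothetical parser: returns labelled strings instead of Document objects.
    # The single placeholder component is an assumption, not labridge behavior.
    components: List[str] = ["MAINTEXT"]

    def parse_paper(self, file_path: Union[str, Path]) -> List[str]:
        return [f"{Path(file_path).name}:{name}" for name in self.components]


class NatureStubParser(StubParser):
    # Component names taken from the docstring example for Nature papers.
    components = ["ABSTRACT", "MAINTEXT", "REFERENCES", "METHODS"]


class IEEEStubParser(StubParser):
    pass  # Real IEEE components are not listed in the source; placeholder only.


class DefaultStubParser(StubParser):
    pass


# Hypothetical lookup table replacing the if/elif chain.
PARSERS: Dict[PaperSource, Type[StubParser]] = {
    PaperSource.NATURE: NatureStubParser,
    PaperSource.IEEE: IEEEStubParser,
    PaperSource.DEFAULT: DefaultStubParser,
}


class StubAnalyzer:
    # Hypothetical analyzer: guesses the source from the file name only.
    def analyze_source(self, file_path: Union[str, Path], use_llm: bool) -> PaperSource:
        name = Path(file_path).name.lower()
        if "nature" in name:
            return PaperSource.NATURE
        if "ieee" in name:
            return PaperSource.IEEE
        return PaperSource.DEFAULT


def auto_parse(file_path: Union[str, Path], analyzer, use_llm: bool) -> List[str]:
    # Same control flow as auto_parse_paper: analyze, pick a parser, parse.
    source = analyzer.analyze_source(file_path, use_llm)
    try:
        parser_cls = PARSERS[source]
    except KeyError:
        raise ValueError("Invalid paper source.") from None
    return parser_cls().parse_paper(file_path=file_path)


if __name__ == "__main__":
    for part in auto_parse("nature_paper.pdf", StubAnalyzer(), use_llm=False):
        print(part)
```

As in the original function, a source that has no registered parser raises `ValueError("Invalid paper source.")`.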