Labridge

Nature parser

`labridge.func_modules.paper.parse.parsers.nature_parser` ¶

`labridge.func_modules.paper.parse.parsers.nature_parser.NaturePaperParser` ¶

Bases: BasePaperParser

Parse the paper according to the Nature template.

| PARAMETER | DESCRIPTION | TYPE | DEFAULT |
| --- | --- | --- | --- |
| `separators` | Each tuple holds the separators that separate two components. Defaults to `NATURE_SEPARATORS`. | `List[Tuple[str]]` | `None` |
| `content_names` | Key: component index; value: component name candidates. Defaults to `NATURE_CONTENT_NAMES`. | `Dict[int, Tuple[str]]` | `None` |
| `separator_tolerance` | The number of mismatched characters tolerated. | `int` | `3` |
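The page does not show how `separator_tolerance` is applied inside `BasePaperParser`. As a rough illustration of what "tolerance of mismatch chars" could mean, the sketch below accepts a candidate string when it differs from the expected separator in at most `tolerance` characters. The helper `matches_with_tolerance` is hypothetical and is not part of Labridge; the real matching logic may differ.

```python
def matches_with_tolerance(candidate: str, separator: str, tolerance: int = 3) -> bool:
    """Return True if `candidate` differs from `separator` in at most `tolerance` characters.

    Hypothetical helper for illustration only; this is an assumption about
    what a mismatch tolerance might mean, not Labridge's implementation.
    """
    length_gap = abs(len(candidate) - len(separator))
    if length_gap > tolerance:
        return False
    # Count position-wise mismatches over the shared prefix length,
    # then charge one mismatch for every extra or missing character.
    mismatches = sum(a != b for a, b in zip(candidate, separator))
    return mismatches + length_gap <= tolerance


if __name__ == "__main__":
    # A single OCR-style slip ("l" instead of "t") is accepted with tolerance 1.
    print(matches_with_tolerance("Abstracl", "Abstract", tolerance=1))
    # Two different section names are rejected with the default tolerance.
    print(matches_with_tolerance("Methods", "Results"))
```

A small tolerance like this would let a parser survive minor PDF text-extraction errors in section headings while still rejecting unrelated headings.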

Source code in labridge\func_modules\paper\parse\parsers\nature_parser.py

class NaturePaperParser(BasePaperParser):
	r"""
	Parse the paper according to the Nature template.

	Args:
		separators (List[Tuple[str]]): Each tuple includes the separators that separate two components.
			Defaults to `NATURE_SEPARATORS`.
		content_names (Dict[int, Tuple[str]]): Key: component index; Value: component name candidates.
			Defaults to `NATURE_CONTENT_NAMES`.
		separator_tolerance (int): The number of mismatched characters tolerated. Defaults to 3.
	"""
	def __init__(self,
				 separators: List[Tuple[str]] = None,
				 content_names: Dict[int, Tuple[str]] = None,
				 separator_tolerance: int = 3):
		separators = separators or NATURE_SEPARATORS
		content_names = content_names or NATURE_CONTENT_NAMES
		super().__init__(separators, content_names, separator_tolerance)

	def parse_title(self, file_path: Union[str, Path]) -> str:
		r""" Read the title from the PDF's table of contents. Using an LLM to extract the title and other information is suggested instead. """
		doc = pymupdf.open(file_path)
		toc = doc.get_toc()
		title = None
		try:
			while isinstance(toc[0], list):
				toc = toc[0]
				title = toc[1]
		except IndexError:
			print(f">>> PyMupdf failed to get toc from {file_path}")
		return title

`labridge.func_modules.paper.parse.parsers.nature_parser.NaturePaperParser.parse_title(file_path)` ¶

Read the title from the PDF's table of contents. Using an LLM to extract the title and other information is suggested instead.

Source code in labridge\func_modules\paper\parse\parsers\nature_parser.py

def parse_title(self, file_path: Union[str, Path]) -> str:
	r""" Read the title from the PDF's table of contents. Using an LLM to extract the title and other information is suggested instead. """
	doc = pymupdf.open(file_path)
	toc = doc.get_toc()
	title = None
	try:
		while isinstance(toc[0], list):
			toc = toc[0]
			title = toc[1]
	except IndexError:
		print(f">>> PyMupdf failed to get toc from {file_path}")
	return title
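`parse_title` relies on the table of contents that PyMuPDF's `Document.get_toc()` returns: a list of entries of the form `[level, title, page]`. It takes the title of the first entry and returns `None` when the list is empty. The sketch below isolates that lookup as a pure function so it can be tried without a PDF. The name `first_toc_title` is hypothetical, and unlike the method above it also skips entries whose title is blank, which is an added assumption rather than Labridge behavior.

```python
from typing import Optional, Sequence, Union

# One TOC entry in the shape produced by PyMuPDF's Document.get_toc():
# [level, title, page]
TocEntry = Sequence[Union[int, str]]


def first_toc_title(toc: Sequence[TocEntry]) -> Optional[str]:
    """Return the first non-blank title in a TOC list, or None if there is none.

    Hypothetical helper for illustration; not part of Labridge.
    """
    for entry in toc:
        # Guard against malformed entries before reading the title field.
        if len(entry) >= 2 and isinstance(entry[1], str) and entry[1].strip():
            return entry[1].strip()
    return None


if __name__ == "__main__":
    sample_toc = [[1, "An example paper title", 1], [2, "Introduction", 2]]
    print(first_toc_title(sample_toc))
    # Many PDFs carry no outline at all, in which case no title is found.
    print(first_toc_title([]))
```

Returning `None` for a missing outline makes the failure explicit to the caller, which matters because PDFs often ship without an embedded table of contents; that is one reason the docstring suggests using an LLM for title extraction.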