Skip to content

Command Line Interface

Once you have installed the blueprint, you can use it from the CLI.

You can either provide the path to a configuration file:

structured-qa --from_config "example_data/config.yaml"

Or provide values to the arguments directly:

structured-qa \
--question "What learning rate was used?" \
--input_file "example_data/1706.03762v7.pdf" \
--output_folder "example_outputs/1706.03762v7.pdf"

structured_qa.cli.structured_qa(question, input_file=None, output_dir=None, model='bartowski/Qwen2.5-3B-Instruct-GGUF/Qwen2.5-3B-Instruct-f16.gguf', find_prompt=FIND_PROMPT, answer_prompt=ANSWER_PROMPT, from_config=None)

Structured Question Answering.

Split the input document into sections and answer the question based on the sections.

Parameters:

Name Type Description Default
input_file str | None

Path to the input document.

None
output_dir str | None

Path to the output directory. Structure of the output directory:

output_dir/
    section_1.txt
    section_2.txt
    ...
None
model str | None

Model identifier formatted as owner/repo/file. Must be hosted at the HuggingFace Hub in GGUF format.

'bartowski/Qwen2.5-3B-Instruct-GGUF/Qwen2.5-3B-Instruct-f16.gguf'
question str

The question to answer.

required
find_prompt str

The prompt for finding the section.

See FIND_PROMPT.

FIND_PROMPT
answer_prompt str

The prompt for answering the question.

See ANSWER_PROMPT.

ANSWER_PROMPT
from_config str | None

The path to the config file.

If provided, all other arguments will be ignored.

None
Source code in src/structured_qa/cli.py
@logger.catch(reraise=True)
def structured_qa(
    question: str,
    input_file: str | None = None,
    output_dir: str | None = None,
    model: str
    | None = "bartowski/Qwen2.5-3B-Instruct-GGUF/Qwen2.5-3B-Instruct-f16.gguf",
    find_prompt: str = FIND_PROMPT,
    answer_prompt: str = ANSWER_PROMPT,
    from_config: str | None = None,
):
    """
    Structured Question Answering.

    Split the input document into sections and answer the question based on the sections.

    Args:
        input_file: Path to the input document.
        output_dir: Path to the output directory.
            Structure of the output directory:

            ```
            output_dir/
                section_1.txt
                section_2.txt
                ...
            ```
        model: Model identifier formatted as `owner/repo/file`.
            Must be hosted at the HuggingFace Hub in GGUF format.
        question: The question to answer.
        find_prompt: The prompt for finding the section.

            See [`FIND_PROMPT`][structured_qa.config.FIND_PROMPT].
        answer_prompt: The prompt for answering the question.

            See [`ANSWER_PROMPT`][structured_qa.config.ANSWER_PROMPT].
        from_config: The path to the config file.

            If provided, all other arguments will be ignored.
    """
    if from_config:
        raw_config = yaml.safe_load(Path(from_config).read_text())
        Path(raw_config["output_dir"]).mkdir(exist_ok=True, parents=True)
        config = Config.model_validate(raw_config)
    else:
        Path(output_dir).mkdir(exist_ok=True, parents=True)
        config = Config(
            input_file=input_file,
            output_dir=output_dir,
            model=model,
            find_prompt=find_prompt,
            answer_prompt=answer_prompt,
        )

    logger.info("Loading and converting to sections")
    document_to_sections_dir(config.input_file, config.output_dir)
    logger.success("Done")

    logger.info("Loading Model")
    model = load_llama_cpp_model(config.model)
    logger.success("Done")

    logger.info("Answering")
    answer, sections_checked = find_retrieve_answer(
        model=model,
        sections_dir=config.output_dir,
        question=question,
        find_prompt=config.find_prompt,
        answer_prompt=config.answer_prompt,
    )
    logger.success("Done")

    logger.info("Sections checked:")
    logger.info(sections_checked)
    logger.info("Answer:")
    logger.info(answer)

structured_qa.config.Config

Bases: BaseModel

Source code in src/structured_qa/config.py
class Config(BaseModel):
    input_file: FilePath
    output_dir: DirectoryPath
    model: Annotated[str, AfterValidator(validate_model)]
    answer_prompt: Annotated[str, AfterValidator(answer_prompt)]
    find_prompt: Annotated[str, AfterValidator(validate_find_prompt)]