Invoice Documents
Starting with version 3.28.0, Aiviro includes an improved invoice processing system that simplifies extracting structured data from invoices.
Action
- class aiviro.actions.documents.ProcessInvoice(filepath: Path | str, customer_order_number_format: str = '', vendor_order_number_format: str = '')
Process invoice documents and extract structured data.
This action processes PDF invoice documents and returns a structured DocumentInvoiceV2 object containing comprehensive information including vendor details, customer data, line items, payment information, and more.
- Parameters:
  - filepath – Path to the invoice PDF file to process.
  - customer_order_number_format – Format hint for the customer order number field (max 128 chars).
  - vendor_order_number_format – Format hint for the vendor order number field (max 128 chars).
- Returns:
  DocumentInvoiceV2 object containing the size unit, the page size, and the extracted invoice data.
- Example:
>>> from aiviro.actions.documents import ProcessInvoice
>>> from aiviro.core.utils.api_client.schemas.reader import InvoiceDataV2, DocumentInvoiceV2
>>>
>>> robot = ...  # e.g.: create_desktop_robot()
>>> invoice_path = "path/to/invoice.pdf"
>>> process_invoice = ProcessInvoice(filepath=invoice_path)
>>> result = process_invoice(robot=robot)
>>>
>>> # access extracted data
>>> inv_data = result.invoice_data
>>> print(f"Vendor: {inv_data.vendor.name.value}")
>>> print(f"Total Amount: {inv_data.primary_total.total_amount.value}")
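Every field listed under Pydantic Models defaults to None, so a chained access such as inv_data.vendor.name.value can raise AttributeError when a field was not extracted. The following is a minimal sketch of a defensive accessor. The get_value helper is hypothetical and not part of Aiviro, and the SimpleNamespace objects are stand-ins that only mimic the attribute layout of the real models.

```python
from types import SimpleNamespace
from typing import Any


def get_value(obj: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted attribute path and return the final `.value`.

    Hypothetical helper (not part of Aiviro). Returns `default` as soon as
    any attribute along the path is missing or None.
    """
    current = obj
    for name in path.split("."):
        current = getattr(current, name, None)
        if current is None:
            return default
    value = getattr(current, "value", None)
    return default if value is None else value


# Stand-in objects that only mimic the attribute layout of InvoiceDataV2.
inv_data = SimpleNamespace(
    vendor=SimpleNamespace(name=SimpleNamespace(value="ACME s.r.o.")),
    primary_total=None,  # simulate totals that were not extracted
)

print(get_value(inv_data, "vendor.name"))
print(get_value(inv_data, "primary_total.total_amount", default="n/a"))
```

With a real result, the same helper could be called as get_value(result.invoice_data, "vendor.name"), assuming the attribute names match the schema listed under Pydantic Models.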
Note
The legacy InvoiceReader is still available via Reader for backward compatibility.
Data Schemas
Below are the detailed data models that represent the structured information extracted from invoices by the Aiviro processing system. These models define all the fields and their types that you can access after processing an invoice.
InvoiceDataV2

| Attribute | Type | Description |
|---|---|---|
| customer | InvoiceEntityV2 | Customer details including name, tax ID, address, etc. |
| vendor | InvoiceVendorV2 | Vendor details including name, tax ID, address, etc. |
| shipping_address | InvoiceAddressV2 | Shipping address information if different from customer address |
| shipping_address_recipient | FieldDataV2[str] | Name associated with the shipping address |
| invoice_id | FieldDataV2[str] | Unique identifier/number of the invoice |
| invoice_date | FieldDataV2[date] | Date when the invoice was issued |
| due_date | FieldDataV2[date] | Date payment for this invoice is due |
| tax_date | FieldDataV2[date] | Date the tax was applied to the invoice |
| order_number | list[FieldDataV2[str]] | List of order reference numbers |
| primary_total | InvoiceTotalsV2 | Main invoice totals in the primary currency (e.g., EUR for international transactions) |
| secondary_total | InvoiceTotalsV2 | Additional invoice totals in a secondary currency (e.g., CZK for Czech invoices that also show amounts in local currency) |
| exchange_rate | FieldDataV2[Decimal] | Exchange rate between primary and secondary currency |
| vat_rate | FieldDataV2[Decimal] | VAT rate applied to the invoice |
| variable_symbol | FieldDataV2[str] | Variable symbol of the invoice |
| payment_term | FieldDataV2[str] | The terms of payment for the invoice |
| bank_accounts | list[InvoiceBankAccountV2] | List of bank accounts for payment |
| items | list[InvoiceItemV2] | List of invoice line items, filtered by totals |
| raw_items | list[InvoiceItemV2] | List of unfiltered invoice line items |
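Because any of the invoice attributes above may be absent after extraction, a workflow typically needs to decide whether an invoice can be processed automatically or should go to manual review. The sketch below checks a plain dictionary of values for missing entries. The REQUIRED_FIELDS tuple and the missing_fields function are hypothetical and only illustrate the idea; which fields are mandatory depends on your process.

```python
from datetime import date
from typing import Any

# Hypothetical list of fields this workflow treats as mandatory.
REQUIRED_FIELDS = ("invoice_id", "invoice_date", "due_date", "primary_total")


def missing_fields(values: dict[str, Any], required=REQUIRED_FIELDS) -> list[str]:
    """Return the required keys that are absent or empty in `values`."""
    return [key for key in required if values.get(key) in (None, "", [], {})]


# Hand-written sample data; a real dictionary would come from the extracted invoice.
values = {
    "invoice_id": "INV-12345",
    "invoice_date": date(2022, 11, 1),
    "due_date": None,
}

print(missing_fields(values))  # ['due_date', 'primary_total']
```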
InvoiceEntityV2

| Attribute | Type | Description |
|---|---|---|
| name | FieldDataV2[str] | Name of the entity (company or person) |
| id | FieldDataV2[str] | Entity reference ID |
| tax_id | FieldDataV2[str] | The taxpayer number (VAT ID) |
| address | InvoiceAddressV2 | Mailing address |
| address_recipient | FieldDataV2[str] | Name associated with the address |
| ico | FieldDataV2[str] | The Czech company ID number |
InvoiceVendorV2

| Attribute | Type | Description |
|---|---|---|
| is_not_tax_payer | FieldDataV2[bool] | Indicates if the vendor is not a tax payer |

Inherits all attributes from InvoiceEntityV2.
InvoiceAddressV2

| Attribute | Type | Description |
|---|---|---|
| house_number | | House number |
| road | | Street/road name |
| city | | City name |
| postal_code | | Postal code |
| street_address | | Full street address (combined road and house number) |
| country_code_A3 | | Country code in alpha-3 format (3 letters), e.g., CZE, DEU, AUT, SVK |
InvoiceTotalsV2

| Attribute | Type | Description |
|---|---|---|
| total_amount | | Total gross amount of the invoice |
| total_amount_without_tax | | Total net amount of the invoice (without tax) |
| total_tax | | Total tax amount of the invoice |
| amount_due | FieldDataV2[Decimal] | Amount due for payment |
| currency | | Currency code (e.g., EUR, USD, CZK) |
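Since the totals are described as gross, net, and tax amounts, a simple plausibility check is to confirm that net plus tax equals gross. The sketch below does this with Decimal arithmetic. The totals_are_consistent function and its default tolerance are assumptions made for illustration, not part of Aiviro; the tolerance exists because invoices may round the final amount (the schema exposes a total_amount_rounding field).

```python
from decimal import Decimal


def totals_are_consistent(
    net: Decimal,
    tax: Decimal,
    gross: Decimal,
    tolerance: Decimal = Decimal("0.01"),  # assumed rounding allowance
) -> bool:
    """Check that net + tax matches gross within a rounding tolerance."""
    return abs(net + tax - gross) <= tolerance


print(totals_are_consistent(Decimal("100.00"), Decimal("21.00"), Decimal("121.00")))  # True
print(totals_are_consistent(Decimal("100.00"), Decimal("21.00"), Decimal("125.00")))  # False
```

A failed check does not necessarily mean the extraction is wrong; it is a signal that the invoice may deserve a manual look.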
InvoiceBankAccountV2

| Attribute | Type | Description |
|---|---|---|
| iban | FieldDataV2[str] | IBAN of the bank account |
| swift | FieldDataV2[str] | SWIFT/BIC code of the bank |
| bank_name | FieldDataV2[str] | Name of the bank |
| country_code | FieldDataV2[str] | Country code in alpha-2 format (e.g., CZ, DE, AT) |
| local_account | LocalBankAccountV2 | Local bank account details |
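An extracted IBAN can be sanity-checked before it is used for payment. The sketch below applies the standard mod-97 checksum defined for IBANs. The is_valid_iban function is a hypothetical helper, not an Aiviro API, and it does not verify country-specific lengths or whether the account actually exists.

```python
def is_valid_iban(iban: str) -> bool:
    """Validate an IBAN using the mod-97 checksum (no country-specific rules)."""
    compact = iban.replace(" ", "").upper()
    # IBANs are ASCII alphanumeric and between 15 and 34 characters long.
    if not (compact.isascii() and compact.isalnum() and 15 <= len(compact) <= 34):
        return False
    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # int(ch, 36) maps '0'-'9' to 0-9 and 'A'-'Z' to 10-35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


print(is_valid_iban("GB82 WEST 1234 5698 7654 32"))  # True
print(is_valid_iban("GB83 WEST 1234 5698 7654 32"))  # False
```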
LocalBankAccountV2

| Attribute | Type | Description |
|---|---|---|
| bank_code | FieldDataV2[str] | Bank code |
| account_number | | Account number |
| account_prefix | FieldDataV2[str] | Account prefix |
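The three local account parts map onto the conventional Czech account notation, prefix-number/bank_code, where the prefix is optional. A minimal sketch of assembling that string from plain values follows. The format_local_account function is hypothetical, and its arguments are assumed to be already-unwrapped strings rather than FieldDataV2 objects.

```python
from typing import Optional


def format_local_account(
    account_number: str,
    bank_code: str,
    account_prefix: Optional[str] = None,
) -> str:
    """Build the conventional Czech notation: [prefix-]number/bank_code."""
    number = f"{account_prefix}-{account_number}" if account_prefix else account_number
    return f"{number}/{bank_code}"


print(format_local_account("2000145399", "0800", "19"))  # 19-2000145399/0800
print(format_local_account("2000145399", "0800"))        # 2000145399/0800
```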
InvoiceItemV2

| Attribute | Type | Description |
|---|---|---|
| index | | Line item index |
| content | FieldDataV2[str] | Full text of the line item |
| description | FieldDataV2[str] | Description of the item |
| quantity | FieldDataV2[Decimal] | Quantity of the item |
| unit_price | FieldDataV2[Decimal] | Price per unit |
| unit | FieldDataV2[str] | Unit of measurement (e.g., kg, pcs) |
| product_code | FieldDataV2[str] | Product code/SKU |
| amount | FieldDataV2[Decimal] | Total gross amount for the line item |
| amount_without_tax | FieldDataV2[Decimal] | Total net amount for the line item |
| amount_tax | FieldDataV2[Decimal] | Tax amount for the line item |
| tax_rate | FieldDataV2[Decimal] | Tax rate percentage |
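Line item amounts can be summed and compared against the invoice totals as a cross-check. The sketch below adds up one amount column over a list of plain dictionaries. The sum_item_amounts function is hypothetical, and the dictionary shape (one dict per item, keyed by attribute name) is an assumption about how the item data is held once the values have been unwrapped.

```python
from decimal import Decimal
from typing import Any


def sum_item_amounts(
    items: list[dict[str, Any]],
    key: str = "amount_without_tax",
) -> Decimal:
    """Add up one amount column across line items, skipping missing values."""
    return sum(
        (item[key] for item in items if item.get(key) is not None),
        Decimal("0"),
    )


# Hand-written sample items; real data would come from the extracted invoice.
items = [
    {"description": "Widget", "amount_without_tax": Decimal("40.00")},
    {"description": "Shipping", "amount_without_tax": Decimal("60.00")},
    {"description": "Note line", "amount_without_tax": None},
]

print(sum_item_amounts(items))  # 100.00
```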
Simplified Data Access
The InvoiceDataV2 class provides a convenient dump_to_values() method that converts the entire object into a simple dictionary containing only the actual values, without the additional metadata from the FieldDataV2 wrapper:
# `result` is the DocumentInvoiceV2 returned by ProcessInvoice;
# dump_to_values() is called on its InvoiceDataV2 payload
simple_data = result.invoice_data.dump_to_values()
# Access data with standard dictionary syntax; `or {}` guards against a missing or None entry
vendor_name = (simple_data.get("vendor") or {}).get("name")
total_amount = (simple_data.get("primary_total") or {}).get("total_amount")
print(f"Vendor Name: {vendor_name}")
print(f"Total Amount: {total_amount}")
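A plain dictionary of values is convenient to export, for example as a single CSV row, once nested entries are flattened into dotted keys. The sketch below shows one way to do that. The flatten function is hypothetical, and the nested sample dictionary is hand-written under the assumption that nested models are dumped as nested dictionaries.

```python
from typing import Any


def flatten(values: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dictionaries into a single level with dotted keys.

    Lists and scalar values are kept as they are.
    """
    flat: dict[str, Any] = {}
    for key, value in values.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


# Hand-written sample; the real shape depends on the dumped invoice data.
sample = {
    "invoice_id": "INV-12345",
    "vendor": {"name": "ACME s.r.o.", "address": {"city": "Brno"}},
    "order_number": ["12345"],
}

print(flatten(sample))
```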
Pydantic Models
- pydantic model aiviro.core.utils.api_client.schemas.reader.FieldDataV2
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceItemV2
- field amount: FieldDataV2[Decimal] | None = None
- field amount_tax: FieldDataV2[Decimal] | None = None
- field amount_without_tax: FieldDataV2[Decimal] | None = None
- field content: FieldDataV2[str] | None = None
- field description: FieldDataV2[str] | None = None
- field product_code: FieldDataV2[str] | None = None
- field quantity: FieldDataV2[Decimal] | None = None
- field tax_rate: FieldDataV2[Decimal] | None = None
- field unit: FieldDataV2[str] | None = None
- field unit_price: FieldDataV2[Decimal] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.LocalBankAccountV2
- field account_code: FieldDataV2[str] | None = None
- field account_prefix: FieldDataV2[str] | None = None
- field bank_code: FieldDataV2[str] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceBankAccountV2
- field bank_name: FieldDataV2[str] | None = None
- field country_code: FieldDataV2[str] | None = None
- field iban: FieldDataV2[str] | None = None
- field local_account: LocalBankAccountV2 | None = None
- field swift: FieldDataV2[str] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.VatSubTotalsV2
Represents the total amounts for the corresponding VAT rate
- field vat_rate: FieldDataV2[int] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceTotalsV2
Represents total amounts of the invoice
- field amount_due: FieldDataV2[Decimal] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceAddressV2
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceEntityV2
- field address: InvoiceAddressV2 | None = None
- field address_recipient: FieldDataV2[str] | None = None
- field ico: FieldDataV2[str] | None = None
- field id: FieldDataV2[str] | None = None
- field name: FieldDataV2[str] | None = None
- field tax_id: FieldDataV2[str] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceVendorV2
- field is_not_tax_payer: FieldDataV2[bool] | None = None
- pydantic model aiviro.core.utils.api_client.schemas.reader.InvoiceDataV2
- field bank_accounts: list[InvoiceBankAccountV2] [Optional]
- field customer: InvoiceEntityV2 | None = None
- field customer_order_numbers: list[FieldDataV2[str]] [Optional]
- field due_date: FieldDataV2[date] | None = None
- field exchange_rate: FieldDataV2[Decimal] | None = None
- field invoice_date: FieldDataV2[date] | None = None
- field invoice_id: FieldDataV2[str] | None = None
- field items: list[InvoiceItemV2] [Optional]
- field order_number: list[FieldDataV2[str]] [Optional]
- field payment_term: FieldDataV2[str] | None = None
- field primary_total: InvoiceTotalsV2 | None = None
- field primary_vat_sub_totals: dict[str, VatSubTotalsV2] [Optional]
- field raw_items: list[InvoiceItemV2] [Optional]
- field reverse_charge: FieldDataV2[bool] | None = None
- field secondary_total: InvoiceTotalsV2 | None = None
- field secondary_vat_sub_totals: dict[str, VatSubTotalsV2] [Optional]
- field shipping_address: InvoiceAddressV2 | None = None
- field shipping_address_recipient: FieldDataV2[str] | None = None
- field tax_date: FieldDataV2[date] | None = None
- field total_amount_rounding: FieldDataV2[Decimal] | None = None
- field variable_symbol: FieldDataV2[str] | None = None
- field vat_rate: FieldDataV2[Decimal] | None = None
- field vendor: InvoiceVendorV2 | None = None
- field vendor_order_numbers: list[FieldDataV2[str]] [Optional]
- dump_to_values() → dict[str, Any]
Creates a dictionary of plain values, dropping the value_type and page_index metadata.
- Returns:
Dictionary mapping each attribute name to its plain value.
- Example:
>>> import datetime
>>> from aiviro.core.utils.api_client.schemas.reader import (
...     FieldDataV2,
...     InvoiceDataV2,
...     InvoiceEntityV2,
... )
>>> inv_d = InvoiceDataV2(
...     customer=InvoiceEntityV2(name=FieldDataV2(value="John Doe", value_type="str", page_index=0)),
...     due_date=FieldDataV2(value=datetime.date(2022, 11, 11), value_type="datetime.date", page_index=0),
...     invoice_id=FieldDataV2(value="INV-12345", value_type="str", page_index=0),
...     order_number=[
...         FieldDataV2(value="12345", value_type="str", page_index=0),
...         FieldDataV2(value="67890", value_type="str", page_index=0),
...     ],
... )
>>> print(inv_d.dump_to_values())
... #{
... #   'customer': 'John Doe',
... #   'due_date': datetime.date(2022, 11, 11),
... #   'invoice_id': 'INV-12345',
... #   'order_number': ['12345', '67890'],
... #}