Parse Statement

Upload a bank statement PDF and receive structured JSON data.

POST /v1/parse

Parse a bank statement PDF into structured JSON

Request

Send a multipart/form-data request with the PDF file:

Parameter        Type    Description
file (required)  file    Bank statement PDF. Max 25 MB.
format           string  Response format: "json" (default) or "csv", passed in the query string (?format=csv). CSV returns a downloadable file.
cURL (JSON)
curl -X POST https://api.statementparse.dev/v1/parse \
  -H "Authorization: Bearer sp_live_your_key_here" \
  -F "file=@chase-january-2025.pdf"
cURL (CSV)
curl -X POST "https://api.statementparse.dev/v1/parse?format=csv" \
  -H "Authorization: Bearer sp_live_your_key_here" \
  -F "file=@chase-january-2025.pdf" \
  -o transactions.csv
Python
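import requests

# Open the PDF in a context manager so the file handle is closed after the upload
with open("statement.pdf", "rb") as pdf:
    response = requests.post(
        "https://api.statementparse.dev/v1/parse",
        headers={"Authorization": "Bearer sp_live_your_key_here"},
        files={"file": pdf},
    )
response.raise_for_status()  # Raise on 4xx/5xx instead of reading an error body as a result
data = response.json()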
print(f"Bank: {data['statement']['bank_name']}")
print(f"Transactions: {len(data['transactions'])}")

# Print categorized transactions
for txn in data["transactions"]:
    print(f"  {txn['date']}  {txn['category'] or 'uncategorized':15s}  {txn['amount']:>10.2f}")
JavaScript (Node.js)
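// Run as an ES module (top-level await) on Node.js 18+, which provides
// fetch, FormData, and Blob as globals.
import { readFile } from "node:fs/promises";

// The built-in FormData accepts Blob values, not fs read streams
const pdf = await readFile("statement.pdf");
const form = new FormData();
form.append("file", new Blob([pdf], { type: "application/pdf" }), "statement.pdf");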

const response = await fetch(
  "https://api.statementparse.dev/v1/parse",
  {
    method: "POST",
    headers: { Authorization: "Bearer sp_live_your_key_here" },
    body: form,
  }
);
const data = await response.json();
console.log(`Bank: ${data.statement.bank_name}`);
console.log(`Transactions: ${data.transactions.length}`);

Response

A successful response includes the detected bank, account metadata, all transactions with categories and confidence scores, and a summary:

200 OK
{
  "id": "parse_abc123",
  "status": "success",
  "confidence": 0.95,
  "processing_time_ms": 1250,
  "statement": {
    "bank_name": "JPMorgan Chase",
    "bank_id": "chase",
    "account_holder": "John Smith",
    "account_number_last4": "4567",
    "account_type": "checking",
    "statement_period": {
      "start": "2025-01-01",
      "end": "2025-01-31"
    },
    "opening_balance": 5234.56,
    "closing_balance": 7283.81,
    "currency": "USD"
  },
  "summary": {
    "total_deposits": 5200.00,
    "total_withdrawals": -3150.75,
    "total_transactions": 24,
    "net_change": 2049.25
  },
  "transactions": [
    {
      "date": "2025-01-02",
      "description": "DIRECT DEPOSIT ACME CORP",
      "raw_description": "DIRECT DEP ACME CORP PAYROLL",
      "amount": 3250.00,
      "type": "credit",
      "balance_after": 8484.56,
      "category": "payroll",
      "field_confidence": {
        "date": 1.0,
        "amount": 1.0,
        "description": 1.0,
        "balance": 1.0
      }
    },
    {
      "date": "2025-01-05",
      "description": "AMAZON MARKETPLACE",
      "raw_description": "AMZN Mktp US*AB1CD2EF3",
      "amount": -47.99,
      "type": "debit",
      "balance_after": 8436.57,
      "category": "shopping",
      "field_confidence": {
        "date": 1.0,
        "amount": 1.0,
        "description": 1.0,
        "balance": 1.0
      }
    }
  ],
  "warnings": [],
  "pages_processed": 3,
  "parser_version": "0.1.0"
}

Response Fields

Field               Type     Description
id                  string   Unique parse request ID
status              string   "success" or "error"
confidence          float    Bank detection confidence (0.0 - 1.0)
processing_time_ms  integer  Total processing time in milliseconds
statement           object   Bank and account metadata
summary             object   Aggregate totals for all transactions
transactions        array    List of parsed transactions
warnings            array    Non-fatal issues encountered during parsing
pages_processed     integer  Number of PDF pages processed
parser_version      string   Version of the parser used
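
The balances and summary totals in the sample response are mutually consistent: total_deposits plus total_withdrawals equals net_change, and closing_balance minus opening_balance gives the same figure. The Python sketch below performs that cross-check on a parsed response. The reconcile helper is hypothetical, and it is an assumption that every response satisfies these identities; the documentation only shows them holding for the sample.

```python
from decimal import Decimal


def reconcile(parsed: dict) -> dict:
    """Cross-check summary totals against statement balances (hypothetical helper)."""

    def d(value) -> Decimal:
        # Convert via str() so binary floating-point noise does not cause false mismatches
        return Decimal(str(value))

    statement, summary = parsed["statement"], parsed["summary"]
    net_change = d(summary["net_change"])
    net_from_totals = d(summary["total_deposits"]) + d(summary["total_withdrawals"])
    net_from_balances = d(statement["closing_balance"]) - d(statement["opening_balance"])
    return {
        "totals_match_net_change": net_from_totals == net_change,
        "balances_match_net_change": net_from_balances == net_change,
    }


# Values copied from the sample response
sample = {
    "statement": {"opening_balance": 5234.56, "closing_balance": 7283.81},
    "summary": {
        "total_deposits": 5200.00,
        "total_withdrawals": -3150.75,
        "net_change": 2049.25,
    },
}
print(reconcile(sample))
```

A failed check may be worth logging together with the warnings array rather than treating it as a hard error.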

Transaction Fields

Field             Type           Description
date              string         ISO 8601 date (YYYY-MM-DD)
description       string         Cleaned transaction description
raw_description   string         Original text from the PDF
amount            float          Positive = credit, negative = debit
type              string         "credit" or "debit"
balance_after     float | null   Running balance after this transaction
category          string | null  Auto-detected category (see list below)
field_confidence  object | null  Per-field confidence scores (0.0 - 1.0)

Transaction amounts

Positive amounts are credits (deposits). Negative amounts are debits (withdrawals). The type field repeats this information as "credit" or "debit" for convenience.
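
Because the sign of amount and the type field carry the same information, a client can use one to validate the other. The following is a minimal sketch of that check; sign_mismatches is a hypothetical helper, and skipping zero amounts is an assumption, since the documentation does not say how they are typed.

```python
def sign_mismatches(transactions: list[dict]) -> list[dict]:
    """Return transactions whose type disagrees with the sign of amount."""
    mismatched = []
    for txn in transactions:
        if txn["amount"] == 0:
            # Assumption: zero amounts are not covered by the documented rule, so skip them
            continue
        expected = "credit" if txn["amount"] > 0 else "debit"
        if txn["type"] != expected:
            mismatched.append(txn)
    return mismatched


# The two transactions from the sample response, trimmed to the relevant fields
sample_transactions = [
    {"date": "2025-01-02", "amount": 3250.00, "type": "credit"},
    {"date": "2025-01-05", "amount": -47.99, "type": "debit"},
]
print(sign_mismatches(sample_transactions))
```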

Categories

Transactions are automatically categorized using rule-based keyword matching. The category field will be one of:

payroll         Direct deposits, salary
rent            Rent, lease payments
utilities       Electric, gas, water, internet
groceries       Grocery stores
dining          Restaurants, food delivery
shopping        Amazon, Target, retail
transportation  Uber, gas stations, parking
subscriptions   Netflix, Spotify, SaaS
insurance       GEICO, State Farm, etc.
healthcare      Pharmacy, hospital, dental
transfer        Zelle, Venmo, PayPal
fees            Overdraft, monthly fees
atm             ATM withdrawals/deposits
loan            Mortgage, student loans

The category field is null if no category matches.
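
A common use of the category field is a per-category spending summary. The sketch below totals amounts by category and groups null categories under an "uncategorized" label. The totals_by_category helper and the third sample row are illustrative assumptions, not part of the API.

```python
from collections import defaultdict
from decimal import Decimal


def totals_by_category(transactions: list[dict]) -> dict:
    """Sum signed amounts per category, mapping null to 'uncategorized'."""
    totals = defaultdict(Decimal)  # Decimal() starts each bucket at 0
    for txn in transactions:
        label = txn["category"] or "uncategorized"
        # Convert via str() to avoid accumulating binary floating-point error
        totals[label] += Decimal(str(txn["amount"]))
    return dict(totals)


categorized_sample = [
    {"amount": 3250.00, "category": "payroll"},
    {"amount": -47.99, "category": "shopping"},
    {"amount": -12.50, "category": None},  # hypothetical transaction with no match
]
for label, total in totals_by_category(categorized_sample).items():
    print(f"{label:15s} {total:>10.2f}")
```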

Field Confidence

Each transaction includes per-field confidence scores indicating extraction reliability:

Field        Type   Description
date         float  1.0 for text extraction, 0.5 for OCR
amount       float  1.0 if balance reconciles, 0.8 without verification, 0.5 for OCR
description  float  1.0 for text extraction, 0.6 for OCR
balance      float  1.0 if reconciles, 0.5 standalone, 0.0 if missing
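
These scores can drive a manual-review queue. The sketch below flags any transaction with a field scoring under a threshold, and also flags transactions whose field_confidence is null. The needs_review helper, the 0.8 default threshold, and the sample rows other than the first are assumptions chosen for illustration.

```python
def needs_review(transactions: list[dict], threshold: float = 0.8) -> list[tuple]:
    """Return (transaction, weak_fields) pairs for rows that merit a manual check."""
    flagged = []
    for txn in transactions:
        scores = txn.get("field_confidence")
        if scores is None:
            # No scores at all: flag the row and name the missing object
            flagged.append((txn, ["field_confidence"]))
            continue
        weak = [field for field, score in scores.items() if score < threshold]
        if weak:
            flagged.append((txn, weak))
    return flagged


confidence_sample = [
    {"description": "DIRECT DEPOSIT ACME CORP",
     "field_confidence": {"date": 1.0, "amount": 1.0, "description": 1.0, "balance": 1.0}},
    # Hypothetical OCR-extracted row using score values from the table above
    {"description": "SCANNED CHECK",
     "field_confidence": {"date": 0.5, "amount": 0.5, "description": 0.6, "balance": 0.0}},
    # Hypothetical row without scores
    {"description": "NO SCORES", "field_confidence": None},
]
for txn, weak in needs_review(confidence_sample):
    print(txn["description"], weak)
```

With the default threshold, an amount scored 0.8 (no balance verification) is not flagged, while OCR-derived fields are. Lower or raise the threshold to suit how much manual review is acceptable.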

CSV Output

Add ?format=csv to the request URL to get transactions as a downloadable CSV file. The file begins with statement metadata as # comment lines, followed by a header row with these columns: date, description, raw_description, amount, type, balance_after, category.

CSV Response
# StatementParse Export
# Bank: JPMorgan Chase
# Account: ****4567
# Period: 2025-01-01 to 2025-01-31
# Opening Balance: 5234.56
# Closing Balance: 7283.81
# Total Transactions: 24
#
date,description,raw_description,amount,type,balance_after,category
2025-01-02,DIRECT DEPOSIT ACME CORP,DIRECT DEP ACME CORP PAYROLL,3250.0,credit,8484.56,payroll
2025-01-05,AMAZON MARKETPLACE,AMZN Mktp US*AB1CD2EF3,-47.99,debit,8436.57,shopping
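
Standard CSV readers do not skip # comment lines on their own, so the metadata has to be separated from the rows before parsing. The following is a minimal sketch using Python's csv module. The read_export helper is hypothetical, and it assumes no quoted field contains an embedded line break.

```python
import csv
import io


def read_export(text: str) -> tuple[dict, list[dict]]:
    """Split a CSV export into metadata (from # lines) and transaction rows."""
    metadata, data_lines = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            # "# Bank: JPMorgan Chase" -> key "Bank", value "JPMorgan Chase"
            key, sep, value = line.lstrip("#").strip().partition(": ")
            if sep:
                metadata[key] = value
        elif line.strip():
            data_lines.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(data_lines))))
    return metadata, rows


# Excerpt of the sample CSV response
export_text = """# StatementParse Export
# Bank: JPMorgan Chase
# Account: ****4567
# Period: 2025-01-01 to 2025-01-31
#
date,description,raw_description,amount,type,balance_after,category
2025-01-02,DIRECT DEPOSIT ACME CORP,DIRECT DEP ACME CORP PAYROLL,3250.0,credit,8484.56,payroll
2025-01-05,AMAZON MARKETPLACE,AMZN Mktp US*AB1CD2EF3,-47.99,debit,8436.57,shopping
"""
metadata, rows = read_export(export_text)
print(metadata["Bank"], len(rows))
```

All values come back as strings, so amounts and balances need converting (for example with Decimal) before doing arithmetic on them.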

Errors

400 Bad Request
{
  "error": "invalid_file",
  "message": "Expected a PDF file, got image/png",
  "code": 400
}
401 Unauthorized
{
  "error": "unauthorized",
  "message": "Missing or invalid API key",
  "code": 401
}
429 Too Many Requests
{
  "error": "rate_limit",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "code": 429
}
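
One way a client might handle these responses is to retry on 429 and surface everything else as an exception. The sketch below shows this under several assumptions: ParseError and parse_with_retry are hypothetical names, send is any caller-supplied function returning a (status_code, json_body) pair, and the 60-second wait is taken from the sample 429 message rather than from a documented retry header.

```python
import time


class ParseError(Exception):
    """Carries the error and message fields of a non-200 response body."""

    def __init__(self, status, error, message):
        super().__init__(f"{status} {error}: {message}")
        self.status, self.error, self.message = status, error, message


def parse_with_retry(send, max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call send() until it returns 200, retrying only on 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status == 429 and attempt < max_attempts:
            sleep(wait_seconds)  # back off before the next attempt
            continue
        # 400, 401, other statuses, or a 429 on the final attempt
        raise ParseError(status, body.get("error"), body.get("message"))


# Demonstration with canned responses instead of real HTTP calls
canned = iter([
    (429, {"error": "rate_limit", "message": "Rate limit exceeded. Try again in 60 seconds.", "code": 429}),
    (200, {"id": "parse_abc123", "status": "success"}),
])
waits = []
result = parse_with_retry(lambda: next(canned), sleep=waits.append)
print(result["status"], waits)
```

In real use, send could wrap the requests.post call from the Python example and return (response.status_code, response.json()).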