Data Formats Guide
Overview
ISO-DART saves all data in CSV format for maximum compatibility with analysis tools. This guide describes the structure and format of downloaded data files.
General CSV Format
All CSV files have:
Header row with column names
UTF-8 encoding
Comma-separated values
ISO-8601 date/time formats where applicable
File Naming Conventions
CAISO
Format: {start_date}_to_{end_date}_{query_name}_{data_item}.csv
Examples:
20240101_to_20240131_PRC_LMP_TH_NP15_GEN-APND.csv20240101_to_20240131_SLD_FCST_CAISO_forecast.csv20240101_to_20240131_ENE_WIND_SOLAR_SUMMARY_WIND_TOTAL_GEN_MW.csv
MISO
Format: miso_{data_type}_{date}.csv
Examples:
miso_da_exante_lmp_2024-01-01.csvmiso_fuel_mix_2024-01-01.csvmiso_rt_actual_load_2024-01-01.csv
NYISO
Format: {start_date}_to_{end_date}_{dataid}_{aggregation}.csv
Examples:
20240101_to_20240131_damlbmp_zone.csv20240101_to_20240131_rtfuelmix.csv20240101_to_20240131_pal.csv
SPP
Format: {start_date}_to_{end_date}_SPP_{market}_{data_type}_{location_type}.csv
Examples:
20240101_to_20240131_SPP_DA_LMP_SL.csv20240101_to_20240131_SPP_RTBM_MCP.csv20240101_to_20240131_SPP_Operating_Reserves.csv
BPA
Format: {year}_BPA_{data_type}.csv
Examples:
2024_BPA_Wind_Generation_Total_Load.csv2024_BPA_Reserves_Deployed.csv
PJM
Format: {start_date}_to_{end_date}_{endpoint}.csv
Examples:
01-01-2024_to_01-31-2024_da_hrl_lmps.csv01-01-2024_to_01-31-2024_hrl_load_metered.csv
Weather
Format: {start_date}_to_{end_date}_{station_name}_{state}.csv
Examples:
2024-01-01_to_2024-01-31_San_Francisco_International_Airport_CA.csv
Data Schemas
CAISO LMP Data
File Pattern: *_PRC_LMP_*.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
Interval start (GMT) |
|
|
datetime |
Interval end (GMT) |
|
|
date |
Operating date |
|
|
integer |
Hour ending (1-24) |
|
|
string |
Node ID (XML format) |
|
|
string |
Node ID |
|
|
string |
Node name |
|
|
string |
Market type |
|
|
string |
Data item identifier |
|
|
float |
LMP value ($/MWh) |
|
|
float |
Marginal Loss Component |
|
|
float |
Marginal Congestion Component |
|
Sample Data:
OPR_DATE,INTERVAL_NUM,DATA_ITEM,VALUE,MLC,MCC
2024-01-01,1,TH_NP15_GEN-APND,32.45,0.23,1.12
2024-01-01,2,TH_NP15_GEN-APND,29.87,-0.45,0.89
CAISO Load Forecast
File Pattern: *_SLD_FCST_*.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
Interval start |
|
|
date |
Operating date |
|
|
integer |
Hour ending |
|
|
string |
Area name |
|
|
string |
Market type |
|
|
float |
Load (MW) |
|
CAISO Wind/Solar Summary
File Pattern: *_ENE_WIND_SOLAR_SUMMARY_*.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
Interval start |
|
|
date |
Operating date |
|
|
integer |
5-min interval (1-288) |
|
|
string |
Data type |
|
|
float |
Generation (MW) |
|
Available DATA_ITEM values:
WIND_TOTAL_GEN_MW- Wind generationSOLAR_TOTAL_GEN_MW- Solar generationWIND_FORECAST_MW- Wind forecastSOLAR_FORECAST_MW- Solar forecast
MISO LMP Data
File Pattern: miso_*_lmp_*.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
Timestamp |
|
|
string |
Pricing node |
|
|
float |
Total LMP ($/MWh) |
|
|
float |
Congestion component |
|
|
float |
Loss component |
|
|
float |
LMP value |
|
|
string |
Data status |
|
MISO Fuel Mix
File Pattern: miso_fuel_mix_*.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
5-min timestamp |
|
|
string |
Fuel type |
|
|
string |
MISO region |
|
|
float |
Generation (MW) |
|
Fuel Types: Coal, Gas, Nuclear, Wind, Solar, Hydro, Other
NYISO LBMP Data
File Pattern: *_damlbmp_zone.csv or *_realtime_zone.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
Timestamp (ET) |
|
|
string |
Zone name |
|
|
float |
Total LBMP |
|
|
float |
Loss component |
|
|
float |
Congestion component |
|
Zone Names: CAPITL, CENTRL, DUNWOD, GENESE, H.Q, HUD VL, LONGIL, MHK VL, MILLWD, N.Y.C., NORTH, O.H., PJM, WEST
NYISO Fuel Mix
File Pattern: *_rtfuelmix.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
5-min timestamp |
|
|
string |
Time zone |
|
|
float |
Generation (MW) |
|
|
float |
Generation (MW) |
|
|
float |
Generation (MW) |
|
|
float |
Generation (MW) |
|
|
float |
Generation (MW) |
|
|
float |
Generation (MW) |
|
|
float |
Generation (MW) |
|
SPP LMP Data
File Pattern: *_SPP_*_LMP_*.csv
Column |
Type |
Description |
Example |
|---|---|---|---|
|
datetime |
GMT interval end |
|
|
string |
Location name |
|
|
string |
Physical node |
|
|
float |
Total LMP ($/MWh) |
|
|
float |
Loss component |
|
|
float |
Congestion component |
|
|
float |
Energy component |
|
Working with Data
Loading in Python
import pandas as pd
# CAISO LMP
df = pd.read_csv('data/CAISO/20240101_to_20240131_PRC_LMP_TH_NP15_GEN-APND.csv')
df['OPR_DATE'] = pd.to_datetime(df['OPR_DATE'])
# MISO with datetime index
df = pd.read_csv('data/MISO/miso_da_exante_lmp_2024-01-01.csv')
df['interval'] = pd.to_datetime(df['interval'])
df.set_index('interval', inplace=True)
# Weather data
df = pd.read_csv('data/weather/2024-01-01_to_2024-01-31_San_Francisco_CA.csv',
index_col='time', parse_dates=True)
Loading in R
# CAISO LMP
library(readr)
df <- read_csv('data/CAISO/20240101_to_20240131_PRC_LMP_TH_NP15_GEN-APND.csv')
df$OPR_DATE <- as.Date(df$OPR_DATE)
# MISO
df <- read_csv('data/MISO/miso_da_exante_lmp_2024-01-01.csv')
df$interval <- as.POSIXct(df$interval)
Excel Import
Open Excel
Data → From Text/CSV
Select file
Verify delimiter is comma
Import
Database Import
-- PostgreSQL
COPY caiso_lmp FROM '/path/to/file.csv' DELIMITER ',' CSV HEADER;
-- SQLite
.mode csv
.import data/CAISO/file.csv caiso_lmp
Data Quality Checks
Validate Date Ranges
import pandas as pd
df = pd.read_csv('your_file.csv')
df['OPR_DATE'] = pd.to_datetime(df['OPR_DATE'])
# Check for gaps
date_range = pd.date_range(df['OPR_DATE'].min(), df['OPR_DATE'].max(), freq='D')
missing_dates = set(date_range) - set(df['OPR_DATE'])
if missing_dates:
print(f"Missing dates: {missing_dates}")
Check for Nulls
null_counts = df.isnull().sum()
print(null_counts[null_counts > 0])
Validate Intervals
# Check hourly data has 24 intervals per day
intervals_per_day = df.groupby('OPR_DATE')['INTERVAL_NUM'].nunique()
irregular_days = intervals_per_day[intervals_per_day != 24]
if len(irregular_days) > 0:
print(f"Irregular days: {irregular_days}")
Time Zones
CAISO
All times in Pacific Time (PT)
GMT columns provided for UTC reference
Handles DST automatically (23 or 25 hours)
MISO
All times in Central Time (CT)
ISO 8601 format with UTC offset
NYISO
All times in Eastern Time (ET)
Time zone column included
SPP
GMT timestamps
Central Time implicit
PJM
Eastern Prevailing Time (EPT)
Includes DST handling
Next Steps
See Python API Guide for data processing examples
See Analysis Examples for analysis examples