Create Non-US data bundle for Zipline

I will create a custom (non-US) data bundle for Zipline. In this case, I will create a data bundle for the Thai stock market.

Here are the steps:

  1. Download historical price data from Yahoo Finance, which comes in CSV format.
  2. Create a custom bundle support module called “viacsv“. You can name it anything.
  3. Make Zipline aware of our new bundle by registering it in the .zipline/ folder.
  4. Create the bundle.
  5. Test our bundle with Zipline.


STEP 1 – download Yahoo data

Here is an example of a CSV file. The file contains several columns and rows in the format below. In this case, I downloaded ADVANC, which is a big-cap stock in Thailand. The file name is ‘ADVANC.BK.csv’.

Date       Open       High  Low  Close      Adj Close  Volume
1/4/2000   44.599998  46    43   43.400002  15.736162  1039000
1/5/2000   38.200001  41    38   40.599998  14.720927  2624000
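To see how pandas parses a file in this format, here is a minimal sketch using the two sample rows above. The lowercase column names anticipate the renaming the bundle module performs later; the data is embedded inline only for illustration.

```python
from io import StringIO

import pandas as pd

# The two ADVANC.BK sample rows from above, inlined for the example.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
1/4/2000,44.599998,46,43,43.400002,15.736162,1039000
1/5/2000,38.200001,41,38,40.599998,14.720927,2624000
"""

# Parse the Date column as the index, as the bundle module does.
df = pd.read_csv(StringIO(csv_text), index_col="Date", parse_dates=True)

# Rename Yahoo's headers to the lowercase names zipline expects.
df = df.rename(columns={
    "Open": "open", "High": "high", "Low": "low",
    "Close": "close", "Volume": "volume", "Adj Close": "price",
})

print(df.index[0])         # first trade date
print(list(df.columns))
```

Note that Yahoo's dates are month/day/year, which pandas parses correctly with its defaults.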


STEP 2 – register ‘viacsv’ module to support local CSV files

We need to create a ‘viacsv.py’ file inside Zipline’s installation path on Linux, under the zipline/data/bundles/ directory of your Python site-packages (the exact location depends on how you installed Zipline).

The file looks like this, and you have to edit the path to your CSV file location. In this case, I use ‘/home/node/stockdata/‘.

If you want fewer log messages, please update the line:

boDebug=False # Set False to get fewer log messages

# Ingest stock csv files to create a zipline data bundle

import os

import numpy as np
import pandas as pd

boDebug = True  # Set True to get trace messages

from zipline.utils.cli import maybe_show_progress


def viacsv(symbols, start=None, end=None):

    # strict this in memory so that we can reiterate over it.
    # (Because it could be a generator and they live only once)
    tuSymbols = tuple(symbols)

    if boDebug:
        print("entering viacsv. tuSymbols=", tuSymbols)

    # Define our custom ingest function.
    # The argument list follows zipline's standard bundle ingest signature.
    def ingest(environ,
               asset_db_writer,
               minute_bar_writer,  # unused
               daily_bar_writer,
               adjustment_writer,
               calendar,
               start_session,
               end_session,
               cache,
               show_progress,
               output_dir,
               # pass these as defaults to make them 'nonlocal' in py2
               tuSymbols=tuSymbols,
               start=start,
               end=end):

        if boDebug:
            print("entering ingest and creating blank dfMetadata")

        # Pre-allocate one metadata row per symbol.
        dfMetadata = pd.DataFrame(np.empty(len(tuSymbols), dtype=[
            ('start_date', 'datetime64[ns]'),
            ('end_date', 'datetime64[ns]'),
            ('auto_close_date', 'datetime64[ns]'),
            ('symbol', 'object'),
        ]))

        if boDebug:
            print("dfMetadata", type(dfMetadata))
            print(dfMetadata.describe())

        # We need to feed something that is iterable - like a list or a
        # generator - that is a tuple with an integer for sid and a
        # DataFrame for the data to daily_bar_writer
        liData = []
        iSid = 0
        for S in tuSymbols:
            IFIL = "/home/node/stockdata/" + S + ".csv"  # edit to your path
            if boDebug:
                print("S=", S, "IFIL=", IFIL)

            dfData = pd.read_csv(IFIL, index_col='Date',
                                 parse_dates=True).sort_index()
            if boDebug:
                print("read_csv dfData", type(dfData), "length", len(dfData))

            # Rename Yahoo's column headers to what zipline expects
            dfData.rename(columns={
                'Open': 'open',
                'High': 'high',
                'Low': 'low',
                'Close': 'close',
                'Volume': 'volume',
                'Adj Close': 'price',
            }, inplace=True)

            liData.append((iSid, dfData))

            # the start date is the date of the first trade
            start_date = dfData.index[0]
            if boDebug:
                print("start_date", type(start_date), start_date)

            # the end date is the date of the last trade
            end_date = dfData.index[-1]
            if boDebug:
                print("end_date", type(end_date), end_date)

            # The auto_close date is the day after the last trade.
            ac_date = end_date + pd.Timedelta(days=1)
            if boDebug:
                print("ac_date", type(ac_date), ac_date)

            # Update our meta data
            dfMetadata.iloc[iSid] = start_date, end_date, ac_date, S

            iSid += 1

        if boDebug:
            print("liData", type(liData), "length", len(liData))
            print(liData)
            print("Now calling daily_bar_writer")

        daily_bar_writer.write(liData, show_progress=False)

        # Hardcode the exchange to "YAHOO" for all assets and (elsewhere)
        # register "YAHOO" to resolve to the NYSE calendar, because these are
        # all equities and thus can use the NYSE calendar.
        dfMetadata['exchange'] = "YAHOO"

        if boDebug:
            print("returned from daily_bar_writer")
            print("calling asset_db_writer")
            print("dfMetadata", type(dfMetadata))
            print(dfMetadata)

        # Not sure why symbol_map is needed
        symbol_map = pd.Series(dfMetadata.symbol.index, dfMetadata.symbol)
        if boDebug:
            print("symbol_map", type(symbol_map))
            print(symbol_map)

        asset_db_writer.write(equities=dfMetadata)

        if boDebug:
            print("returned from asset_db_writer")
            print("calling adjustment_writer")

        # No splits or dividends in the CSVs, so write empty adjustments
        adjustment_writer.write()

        if boDebug:
            print("returned from adjustment_writer")
            print("now leaving ingest function")

    if boDebug:
        print("about to return ingest function")
    return ingest

Don’t worry too much about the code above. As long as you edit the file path, it should work correctly.
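If the metadata trick in the module looks opaque, here is a minimal standalone sketch of it: an empty DataFrame pre-allocated from a NumPy structured dtype, filled one row per symbol, with the symbol-to-sid mapping built at the end. The symbols and dates here are illustrative values only.

```python
import numpy as np
import pandas as pd

tuSymbols = ("ADVANC.BK", "PTT.BK")  # example tickers

# One empty metadata row per symbol, with the column dtypes fixed up front.
dfMetadata = pd.DataFrame(np.empty(len(tuSymbols), dtype=[
    ("start_date", "datetime64[ns]"),
    ("end_date", "datetime64[ns]"),
    ("auto_close_date", "datetime64[ns]"),
    ("symbol", "object"),
]))

# Fill each row, as the ingest loop does after reading each CSV.
for iSid, S in enumerate(tuSymbols):
    start_date = pd.Timestamp("2000-01-04")
    end_date = pd.Timestamp("2000-01-05")
    ac_date = end_date + pd.Timedelta(days=1)  # day after the last trade
    dfMetadata.iloc[iSid] = start_date, end_date, ac_date, S

# Map each ticker to its integer sid; the sid is simply the row position.
symbol_map = pd.Series(dfMetadata.symbol.index, dfMetadata.symbol)
print(symbol_map)
```

The sid for each symbol is just its position in the tuple, which is why the ingest loop and the daily_bar_writer payload must use the same ordering.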


STEP 3 – make Zipline aware of the ‘viacsv’ module

Now move to your home directory and create a ‘.zipline’ folder if it does not already exist (on Linux). In this case, I use /home/toro/.zipline. Zipline loads an ‘extension.py’ file from this folder at startup, so put the registration code below there.

from zipline.data.bundles import register
from zipline.data.bundles.viacsv import viacsv
from zipline.utils.calendars import get_calendar
from zipline.utils.calendars import exchange_calendar_lse

eqSym = {
    "ADVANC.BK",  # symbols must match your CSV file names
}

register(
    'csv',  # name this whatever you like
    viacsv(eqSym),
)


STEP 4 – create the bundle

zipline ingest -b csv
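Assuming the zipline CLI is on your PATH and the bundle was registered as 'csv' in extension.py, you can confirm the result with `zipline bundles`, which lists each registered bundle and its ingestion timestamps:

```shell
# Ingest under the name registered in extension.py
zipline ingest -b csv

# List registered bundles and their ingestion timestamps
zipline bundles
```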

If you get an error, it is highly possible that you are using a trading calendar that does not match your CSV data. In my case, I downloaded ADVANC stock data, which follows the Thai stock market trading calendar, and that differs from the US one. So we have to modify the trading calendar before we create the bundle.
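The mismatch is easy to see with plain pandas: ingestion fails when the CSV contains rows on dates the chosen calendar does not treat as trading sessions. The sketch below uses hand-made dates as a stand-in for a real calendar; with zipline installed you would compare against something like get_calendar("NYSE").all_sessions instead.

```python
import pandas as pd

# Dates present in the Thai CSV (example values).
csv_dates = pd.DatetimeIndex(["2000-01-04", "2000-01-05", "2000-01-06"])

# Stand-in for the calendar's sessions; suppose the calendar skips
# 2000-01-06 (e.g. a US holiday the Thai market trades through).
calendar_sessions = pd.DatetimeIndex(["2000-01-04", "2000-01-05", "2000-01-07"])

# Any CSV date missing from the calendar will make ingestion complain.
missing = csv_dates.difference(calendar_sessions)
print(missing)
```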

This is the calendar file location that you need to modify. Don’t forget to back it up before you modify it.



Back-Testing Non-US data with Zipline

I created this post to share how we can use Zipline to back-test non-US data. Zipline was built by a company called Quantopian, which open-sourced it for retail traders to use for stock back-testing. However, out of the box it only supports US market data. Fortunately, there are some things we can do to make it work with non-US data.

I am going to make Zipline work with Thai stock data because I am a professional investor in Thailand and want Zipline to be my main tool for checking whether my trading strategies are sound for the Thai stock market.

I assume that you are familiar with Python and Zipline and with installing packages from the command line. If not, please check my other posts to see how to set up Zipline.

Here are the steps:

1) Downloaded data from Yahoo into a CSV file
2) Implemented the steps to ingest custom data from this link:
3) Ran the ingest command using the LSE calendar (learn more about the LSE calendar on this link)

from zipline.data.bundles import register
from zipline.data.bundles.viacsv import viacsv
from zipline.utils.calendars import get_calendar
from zipline.utils.calendars import exchange_calendar_lse

eqSym = {
    "ADVANC.BK",  # symbols must match your CSV file names
}

register(
    'csv2',  # name this whatever you like
    viacsv(eqSym),
    calendar=get_calendar("LSE"),  # newer zipline versions take calendar_name='LSE'
)

4) Implemented the following code – and so far – bundle_data.equity_daily_bar_reader.trading_calendar.all_sessions has returned a UK-looking calendar

import os

from zipline.data.bundles.core import load
from zipline.data.data_portal import DataPortal
from zipline.finance.trading import TradingEnvironment
from zipline.pipeline.loaders import USEquityPricingLoader
from zipline.utils.calendars import get_calendar

bundle_data = load('csv2', os.environ, None)
cal = bundle_data.equity_daily_bar_reader.trading_calendar.all_sessions
pipeline_loader = USEquityPricingLoader(bundle_data.equity_daily_bar_reader,
                                        bundle_data.adjustment_reader)
choose_loader = make_choose_loader(pipeline_loader)  # helper defined elsewhere
# Remaining TradingEnvironment/DataPortal arguments follow the standard
# zipline 1.x API; adjust to your version if they differ.
env = TradingEnvironment(bm_symbol='^FTSE',
                         exchange_tz='Europe/London',
                         trading_calendar=get_calendar("LSE"))
data = DataPortal(
    env.asset_finder, get_calendar("LSE"),
    first_trading_day=bundle_data.equity_daily_bar_reader.first_trading_day,
    equity_daily_reader=bundle_data.equity_daily_bar_reader,
    adjustment_reader=bundle_data.adjustment_reader,
)