Blockchain beyond the basics

Blockchain runs on a peer-to-peer network, unlike the client-server model. It is a distributed ledger database. The participants who validate transactions are called miners, and they are the ones who reach consensus on those transactions.

We need to understand a few terms to learn blockchain with ease. Public and private keys are used to make communication secure. You hold the private key and others have your public key. You have to keep the private key secret and guard it well. When you sign a message with your private key, others use your public key to verify that it really came from you. When others want to send a message to you, they encrypt it with your public key and send it to you. You are the only one who can decrypt it with your private key.

Only the matching public-private key pair can encrypt and decrypt these messages. This is the heart of blockchain.
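To make the key-pair idea concrete, here is a toy sketch of textbook RSA in Python 3 with deliberately tiny primes. The numbers (p = 61, q = 53, e = 17) and the function name apply_key are assumptions chosen for illustration. Real systems use vetted libraries, much larger keys and padding, and Bitcoin itself uses elliptic-curve signatures rather than RSA.

```python
# Toy textbook RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53                 # two small primes (assumed values)
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(value, key):
    """Raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                  # a message encoded as a number smaller than n

# Confidentiality: anyone encrypts with the public key, only the owner decrypts.
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)

# Authenticity: the owner signs with the private key, anyone verifies.
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key)

print(recovered == message, verified == message)
```

The same pair of keys works in both directions, which is why one key can stay secret while the other is handed out freely.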

A nonce is a number designed to be used only once, which prevents duplication of a unique ID.

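A hash function converts input of any length into an output of the same fixed size. Hashing goes in one direction only: you cannot recover the input from the hash. In practice a hash is unique to its input, because finding two inputs with the same hash is computationally infeasible. Blockchain stores hashes in the database instead of the real data to save disk space. Blockchains such as Bitcoin use SHA-256 to generate addresses and other information used in the chain.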
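A quick way to see the fixed-size property is to hash a short and a long input with Python's standard hashlib module. This is only a small sketch; the helper name sha256_hex and the sample inputs are made up for the example.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10000)
changed_digest = sha256_hex("b")

print(len(short_digest), len(long_digest))   # both digests have the same length
print(short_digest != changed_digest)        # a different input gives a different digest
```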

Mining is compared to mining gold because it is difficult. Miners use their computers, including electricity and processing power, to solve really difficult math problems. The problem is to keep hashing until the hash has a desired property, such as a hash starting with the digits 40. Miners keep hashing and comparing the results until one matches the desired pattern.
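The search described above can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's actual block format or difficulty rule: the function name mine, the sample block data, and the way the nonce is appended to the data are all assumptions. The prefix "40" comes from the example in the text.

```python
import hashlib


def mine(block_data, prefix="40"):
    """Try nonces 0, 1, 2, ... until the SHA-256 hex digest starts with prefix."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("example block data")
print(nonce, digest)
```

Finding the nonce takes many attempts, but anyone can check the result with a single hash. A longer required prefix makes the search harder without making the check any slower.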

Example of info to be used in block chain

Miners need a proof of work to be able to add a block to the blockchain. The information above is hashed before it is added to the chain.

Blockchain Basics

The internet is a big deal in how we live today, and it continues to expand along with mobile usage. We can access any information instantly, any time, anywhere. However, the internet faces a very challenging issue: ‘trust’.

How do we trust the other person on the internet? Is it really that person? We have several mechanisms, such as two-way authentication and firewalls, to increase trust on the internet, but we still get hacked today.

Blockchain was born to solve the ‘trust’ problem on the internet, and it opens up a lot of opportunities for us to work in the blockchain ecosystem.

Today we store data, such as customer contact information (name, email, phone), in a database. The data is stored in structured tables (rows and columns). This is powerful because we can retrieve information from the database very quickly later, and the database can prevent duplication. These days there is also a newer way of storing unstructured data such as social media posts, music, and photos. Both structured and unstructured storage are popular today for content on the internet.

Security in the database is enforced through user access control, which lets us decide who can access and query the data. Some users can only read and some can also write to the database. Unauthorized users are blocked. This is how databases work today, but the approach has limitations. What if someone gains access to the database without authorization, which we call ‘hacking’? Blockchain was born to solve this issue. It challenges the traditional approach to storing data. A traditional, central database decides who is allowed in, and it is a single point of failure if it goes down. It has a central authority that authorizes users or prevents them from accessing the database. Comparing this central authority with business, we have a banker who provides our bank account, a lawyer who checks our signature, and so on.

Blockchain, on the other hand, works differently from a traditional database. It is called a ‘distributed database’, meaning the information is not stored in a central location but is stored by all the users who use the data. In the blockchain world, only the owner of a record can change its information. The change has to be sent to the blockchain (all the users who hold the database) to get ‘consensus’ that it comes from the owner. If it is accepted, the change is added as a new block in the blockchain. We can think of blockchain as a distributed ledger that is appended to forever as new changes are made in the network.

Because blockchain works like this, it is difficult for hackers to change information: they would have to change it in copies of the database all around the world, which is not practical.
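Part of what makes old records hard to change is that each block stores the hash of the block before it. The sketch below is a minimal single-machine illustration of that linking in Python; it leaves out the network, consensus, and mining, and the block fields and function names are assumptions made up for the example.

```python
import hashlib
import json


def block_hash(block):
    """Hash the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": block["index"], "data": block["data"], "prev_hash": block["prev_hash"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a new block that points at the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


ledger = []
add_block(ledger, "Alice pays Bob 5")
add_block(ledger, "Bob pays Carol 2")
print(is_valid(ledger))

ledger[0]["data"] = "Alice pays Bob 500"   # tamper with an old record
print(is_valid(ledger))
```

Editing an old block breaks its stored hash, so every honest copy of the ledger can detect the change.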

This eliminates the central authority of the traditional model, and Bitcoin follows this approach. The Bitcoin network went live in 2009, after its design was published in 2008. It enables a digital currency to be used without the need for banks to establish trust. It relies on participants in the network to reach consensus and approve money transfer transactions. Only 21 million bitcoins will ever exist in the Bitcoin network. Miners are people who connect their computers to the Bitcoin network and solve difficult math problems to earn coins. The last coin is expected to be mined around 2140. If we do not want to be miners, we can buy bitcoin on the market. When we buy it, the transaction goes to the network to get consensus from the chain. The work that miners do to solve the problem is called ‘Proof of Work’. Wide acceptance is the foundation of any coin.
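The 21 million cap follows from Bitcoin's reward schedule: the block reward started at 50 BTC and halves every 210,000 blocks, with a block targeted roughly every 10 minutes. The short Python sketch below adds up the rewards under those parameters. The constant and function names are just labels for this example, and the year estimate is only approximate because real block times vary.

```python
BLOCKS_PER_HALVING = 210_000          # blocks between reward halvings
INITIAL_REWARD = 50 * 100_000_000     # 50 BTC expressed in satoshis
MINUTES_PER_BLOCK = 10                # target block interval


def total_supply_satoshis():
    """Sum the block rewards over every halving era until the reward hits zero."""
    total, reward, eras = 0, INITIAL_REWARD, 0
    while reward > 0:
        total += reward * BLOCKS_PER_HALVING
        reward //= 2                  # integer halving: rewards are whole satoshis
        eras += 1
    return total, eras


total, eras = total_supply_satoshis()
years = eras * BLOCKS_PER_HALVING * MINUTES_PER_BLOCK / (60 * 24 * 365.25)
print(total / 100_000_000)            # just under 21 million BTC
print(2009 + round(years))            # approximate year the last reward is paid
```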

New opportunities for blockchain

We can use it to sell digital assets such as music, photos, and videos. We can prevent illegal selling because we can see how the assets are transferred in the chain. Another example is paperwork. We can add a digital signature using blockchain to ensure nobody modifies a document without everyone knowing. This is the ‘trust’ that we want. A third example is storing our personal identity in the blockchain; we could then remove authorities, because everyone can check whether a person is real or fake by checking the blockchain. However, it is still early, and many companies are experimenting with it.

Real case studies

A startup called Everledger stores diamond information in a blockchain to ensure that the diamonds are genuine.

The DAO raised $168 M in cryptocurrency.

Blockchain can be applied to voting, for example to capture online votes in Colombia.

Smart Contract

A smart contract is code that runs on the blockchain. It is a set of instructions that the blockchain executes, and all the business logic is embedded in it. Ethereum is a project that uses the smart contract concept to build a blockchain framework.

Challenges

Few people have knowledge of blockchain, and many do not understand its potential. Blockchain is new and needs time to mature. Integration will need a set of standards to make it work with existing systems; ISO is looking into it. Transactions require math calculations and need computing power. It is complex to program; Ethereum and Slock.it are examples of projects that involve this kind of programming. Regulations and laws are not ready in many countries. It challenges existing players such as banks.

Blockchain will be a bigger thing in the future because adoption is increasing, and it is going to be like the start and boom of the internet around 2000.

TA-Lib for quant trading

Quants need many libraries to generate trading signals. TA-Lib is a popular one that calculates different technical indicators for us, so we do not have to implement them ourselves.
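To show what a technical indicator is, here is a simple moving average written in plain Python. This is not TA-Lib's API, only a hand-rolled sketch of the arithmetic behind one of the simplest indicators. The function name and the sample prices are made up for the example.

```python
def simple_moving_average(prices, period):
    """Return one average per window of `period` consecutive prices."""
    if period <= 0:
        raise ValueError("period must be positive")
    return [
        sum(prices[i - period + 1 : i + 1]) / period
        for i in range(period - 1, len(prices))
    ]


closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(simple_moving_average(closes, 3))   # [11.0, 12.0, 13.0]
```

A library like TA-Lib saves us from writing and testing dozens of functions like this one.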

Step by step

wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz

tar -xzf ta-lib-0.4.0-src.tar.gz

cd ta-lib/

sudo ./configure

sudo make

sudo make install

pip install ta-lib

#or pip3 install ta-lib

Sometimes you get a memory error because your VM is low on memory. You can fix it by adding swap space. This is an example on Ubuntu.

Check Swap Memory

sudo swapon -s

free -m

Check available Disk Space

df -h

Create Swap File (equal to your RAM)

sudo fallocate -l 4G /swapfile

ls -lh /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
sudo swapon -s
free -m
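If you would rather confirm the result from Python than read the output of free, the sketch below parses /proc/meminfo, which Linux exposes as plain text. The helper name parse_meminfo is an assumption for this example, and the file only exists on Linux.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


if os.path.exists("/proc/meminfo"):           # Linux only
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
    print("SwapTotal (MB):", info.get("SwapTotal", 0) // 1024)
```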

Make Swap permanent after reboot

sudo nano /etc/fstab

Add to the bottom…
/swapfile none swap sw 0 0

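If you get an error like the one below, run the export command that follows it so that the loader can find the library under /usr/local/lib: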
ImportError: libta_lib.so.0: cannot open shared object file: No such file or directory

export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

Build app with process builder

Salesforce allows us to create an app with a point-and-click model. The beauty of the Salesforce platform is that it has ‘Process Builder’, which allows us to automate process flows very easily without coding. Below is an example of the app objects that we are going to build.

We will build an app that can create an invoice on Salesforce. For simplicity, we do not focus on Invoice Line Items in this post.

It starts from Account -> Payment -> Transaction -> Invoice.

First, we create the Invoice object and specify an auto number with a good format.

Make sure you select “Launch New Custom Tab…”

Click Next until you finish, then create new fields according to the object schema above.

Repeat the same for other objects.

Add new relationship fields…

Start Quant Trading with IB Account

To start algo trading, it is recommended that you open an account with Interactive Brokers (IB), or at least get a demo account, to access real-time market data and historical data.

Then download the 3 programs below to get ready for algo coding.

First Setup IB Gateway

Before you can get market data or send orders, you need to log in to the IB Gateway application and make some changes to the configuration.

Select the “IB API” option and enter your credentials. You can select “Live Trading” or “Paper Trading” for real or simulated trading respectively.

After logging in, you should see the green status “Connected”. Then click the “Configuration” menu and go to the API section.

Make sure you un-check “Read-only API” to allow Python to send orders to IB, and change the port to “7496” for IBridgePy to work correctly.
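Before running any trading code, it can help to confirm that something is listening on the configured port. The sketch below uses only Python's standard socket module; it does not speak the IB API, it only checks that a TCP connection can be opened. The host 127.0.0.1 assumes IB Gateway runs on the same machine, and the function name is made up for this example.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1 assumes IB Gateway runs on the same machine as your script.
    print(is_port_open("127.0.0.1", 7496))
```

If this prints False while IB Gateway shows “Connected”, the port in the API settings is the first thing to check.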

Create Non-US data bundle for Zipline

I will create a custom (non-US) data bundle for Zipline. In this case, I will create a data bundle for the Thai stock market.

Here are the steps:

  1. Get price data from Yahoo Finance, which normally comes in CSV format.
  2. Create a custom bundle support module called “viacsv”. You can name it anything.
  3. Make Zipline aware of our new bundle by registering it via .zipline/extension.py
  4. Create the bundle
  5. Test our bundle with Zipline

 

STEP 1 – download Yahoo data

Here is an example of a CSV file. The file has several columns and rows, as shown below. In this case, I downloaded ADVANC, which is a big-cap stock in Thailand. The file name is ‘ADVANC.BK.csv’.

Date,Open,High,Low,Close,Adj Close,Volume
1/4/2000,44.599998,46,43,43.400002,15.736162,1039000
1/5/2000,38.200001,41,38,40.599998,14.720927,2624000
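As a quick check of the layout, the sketch below reads the two sample rows with Python's standard csv module and applies the same column renaming that the bundle code in STEP 2 uses. It assumes the file is comma-separated, as Yahoo Finance downloads are. The names SAMPLE, RENAME and read_bars are made up for this example, and the real bundle uses pandas rather than this helper.

```python
import csv
import io

# Two sample rows in the Yahoo Finance layout.
SAMPLE = """Date,Open,High,Low,Close,Adj Close,Volume
1/4/2000,44.599998,46,43,43.400002,15.736162,1039000
1/5/2000,38.200001,41,38,40.599998,14.720927,2624000
"""

# Same renaming the bundle code applies: Yahoo header -> lower-case field name.
RENAME = {"Open": "open", "High": "high", "Low": "low",
          "Close": "close", "Volume": "volume", "Adj Close": "price"}


def read_bars(handle):
    """Yield one dict per row with renamed, numeric price and volume fields."""
    for row in csv.DictReader(handle):
        bar = {"date": row["Date"]}
        for source, target in RENAME.items():
            bar[target] = float(row[source])
        yield bar


bars = list(read_bars(io.StringIO(SAMPLE)))
print(len(bars), bars[0]["close"], bars[1]["volume"])
```

If a header is missing or misspelled in your own file, this raises a KeyError, which is easier to read than an error from deep inside the ingest step.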

 

STEP 2 – register ‘viacsv’ module to support local CSV files

Zipline installation path in Linux is:

/usr/local/lib/python2.7/dist-packages/zipline

We need to create the ‘viacsv.py’ file at the path below.

/usr/local/lib/python2.7/dist-packages/zipline/data/bundles/viacsv.py

The file looks like the listing below, and you have to edit the path to match your file location. In this case, I use ‘/home/node/stockdata/’.

If you want fewer log messages, update this line:

boDebug=False # Set False to get less log messages

#
# Ingest stock csv files to create a zipline data bundle

import os

import numpy as np
import pandas as pd
import datetime

boDebug=True # Set True to get trace messages

from zipline.utils.cli import maybe_show_progress

def viacsv(symbols,start=None,end=None):

    # strict this in memory so that we can reiterate over it.
    # (Because it could be a generator and they live only once)
    tuSymbols = tuple(symbols)

    if boDebug:
        print "entering viacsv. tuSymbols=",tuSymbols

    # Define our custom ingest function
    def ingest(environ,
               asset_db_writer,
               minute_bar_writer, # unused
               daily_bar_writer,
               adjustment_writer,
               calendar,
               cache,
               show_progress,
               output_dir,
               # pass these as defaults to make them 'nonlocal' in py2
               start=start,
               end=end):

        if boDebug:
            print "entering ingest and creating blank dfMetadata"

        dfMetadata = pd.DataFrame(np.empty(len(tuSymbols), dtype=[
            ('start_date', 'datetime64[ns]'),
            ('end_date', 'datetime64[ns]'),
            ('auto_close_date', 'datetime64[ns]'),
            ('symbol', 'object'),
        ]))

        if boDebug:
            print "dfMetadata",type(dfMetadata)
            print dfMetadata.describe
            print

        # We need to feed something that is iterable - like a list or a generator -
        # that is a tuple with an integer for sid and a DataFrame for the data to
        # daily_bar_writer

        liData=[]
        iSid=0
        for S in tuSymbols:
            IFIL="/home/node/stockdata/"+S
            if boDebug:
                print "S=",S,"IFIL=",IFIL
            dfData=pd.read_csv(IFIL,index_col='Date',parse_dates=True).sort_index()
            if boDebug:
                print "read_csv dfData",type(dfData),"length",len(dfData)
                print
            dfData.rename(
                columns={
                    'Open': 'open',
                    'High': 'high',
                    'Low': 'low',
                    'Close': 'close',
                    'Volume': 'volume',
                    'Adj Close': 'price',
                },
                inplace=True,
            )
            dfData['volume']=dfData['volume']/1000
            liData.append((iSid,dfData))

            # the start date is the date of the first trade and
            start_date = dfData.index[0]
            if boDebug:
                print "start_date",type(start_date),start_date

            # the end date is the date of the last trade
            end_date = dfData.index[-1]
            if boDebug:
                print "end_date",type(end_date),end_date

            # The auto_close date is the day after the last trade.
            ac_date = end_date + pd.Timedelta(days=1)
            if boDebug:
                print "ac_date",type(ac_date),ac_date

            # Update our meta data
            dfMetadata.iloc[iSid] = start_date, end_date, ac_date, S

            iSid += 1

        if boDebug:
            print "liData",type(liData),"length",len(liData)
            print liData
            print
            print "Now calling daily_bar_writer"

        daily_bar_writer.write(liData, show_progress=False)

        # Hardcode the exchange to "YAHOO" for all assets and (elsewhere)
        # register "YAHOO" to resolve to the NYSE calendar, because these are
        # all equities and thus can use the NYSE calendar.
        dfMetadata['exchange'] = "YAHOO"

        if boDebug:
            print "returned from daily_bar_writer"
            print "calling asset_db_writer"
            print "dfMetadata",type(dfMetadata)
            print dfMetadata
            print

        # Not sure why symbol_map is needed
        symbol_map = pd.Series(dfMetadata.symbol.index, dfMetadata.symbol)
        if boDebug:
            print "symbol_map",type(symbol_map)
            print symbol_map
            print

        asset_db_writer.write(equities=dfMetadata)

        if boDebug:
            print "returned from asset_db_writer"
            print "calling adjustment_writer"

        adjustment_writer.write()

        if boDebug:
            print "returned from adjustment_writer"
            print "now leaving ingest function"

    if boDebug:
        print "about to return ingest function"
    return ingest

Do not worry too much about the code above. As long as you edit the file path, it should work correctly.

 

STEP 3 – Make Zipline aware of the ‘viacsv’ module

Now move to your home directory and, if you use Linux, create a ‘.zipline’ folder. In this case, I use /home/toro/.zipline. Put the following in extension.py inside that folder:

from zipline.data.bundles import register
from zipline.data.bundles.viacsv import viacsv
from zipline.utils.calendars import get_calendar
from zipline.utils.calendars import exchange_calendar_lse

eqSym = {
    "ADVANC",
}

register(
    'csv', # name this whatever you like
    viacsv(eqSym),
    calendar_name='LSE',
)

 

STEP 4 – create bundle

zipline ingest -b csv

If you get an error, it is highly likely that you are using a trading calendar that does not match your CSV data. In my case, I downloaded ADVANC stock data, which follows the Thai stock market trading calendar, and that is different from the US one. So we have to modify the trading calendar before we try to create the bundle.

This is the calendar file that you need to modify. Don’t forget to back it up before you modify it.

/usr/local/lib/python2.7/dist-packages/zipline/utils/calendars/exchange_calendar_lse.py
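To find out which days disagree before editing the calendar, you can compare the dates in your CSV with the sessions the calendar expects. The sketch below shows the comparison in plain Python with made-up dates. How you obtain the calendar's sessions depends on your Zipline version, so the two lists and the function name here are hypothetical placeholders.

```python
from datetime import date


def compare_sessions(csv_dates, calendar_sessions):
    """Return (dates only in the CSV, sessions only in the calendar), both sorted."""
    csv_set, cal_set = set(csv_dates), set(calendar_sessions)
    return sorted(csv_set - cal_set), sorted(cal_set - csv_set)


# Hypothetical example: the calendar treats 5 Jan as a holiday, the CSV does not.
csv_dates = [date(2000, 1, 4), date(2000, 1, 5), date(2000, 1, 6)]
calendar_sessions = [date(2000, 1, 4), date(2000, 1, 6), date(2000, 1, 7)]

extra_in_csv, missing_from_csv = compare_sessions(csv_dates, calendar_sessions)
print(extra_in_csv)       # trading days the calendar does not know about
print(missing_from_csv)   # calendar sessions with no row in the CSV
```

Both lists should be empty once the calendar matches the market your data comes from.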

 

Back-Testing Non-US data with Zipline

I created this post to share how we can use Zipline to back-test non-US data. Zipline was developed by a VC-backed company called Quantopian, which open-sourced the code for retail traders to use for stock back-testing. However, it only supports US market data out of the box. Fortunately, there are some things we can do to make it work with non-US data.

I am going to make Zipline work with Thai stock data because I am a professional investor in Thailand and want Zipline to be my main tool for checking whether my trading strategies are sound for the Thai stock market.

I assume that you are familiar with Python and Zipline and know how to install packages from the command line. If not, please check my other posts to see how to set up Zipline.

Here are the steps:

1) Downloaded data from Yahoo into a CSV file
2) Implemented the steps to ingest custom data from this link:
3) Ran the ingest command using the LSE calendar (learn more about the LSE calendar on this link)

from zipline.data.bundles import register
from zipline.data.bundles.viacsv import viacsv
from zipline.utils.calendars import get_calendar
from zipline.utils.calendars import exchange_calendar_lse

eqSym = {
    "CPI",
}

register(
    'csv2', # name this whatever you like
    viacsv(eqSym),
    calendar_name='LSE',
)

4) Implemented the following code. So far, bundle_data.equity_daily_bar_reader.trading_calendar.all_sessions has returned a UK-looking calendar.

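# Imports assume the Zipline 1.x module layout used elsewhere in this post.
import os
from zipline.data.bundles import load
from zipline.data.data_portal import DataPortal
from zipline.finance.trading import TradingEnvironment
from zipline.pipeline.loaders import USEquityPricingLoader
from zipline.utils.calendars import get_calendar

# make_choose_loader and parse_sqlite_connstr are helpers not shown in this post.
bundle_data = load('csv2', os.environ, None)
cal = bundle_data.equity_daily_bar_reader.trading_calendar.all_sessions
pipeline_loader = USEquityPricingLoader(bundle_data.equity_daily_bar_reader, bundle_data.adjustment_reader)
choose_loader = make_choose_loader(pipeline_loader)
env = TradingEnvironment(bm_symbol='^FTSE',
                         exchange_tz='Europe/London',
                         asset_db_path=parse_sqlite_connstr(bundle_data.asset_finder.engine.url))

data = DataPortal(
    env.asset_finder, get_calendar("LSE"),
    first_trading_day=bundle_data.equity_minute_bar_reader.first_trading_day,
    equity_minute_reader=bundle_data.equity_minute_bar_reader,
    equity_daily_reader=bundle_data.equity_daily_bar_reader,
    adjustment_reader=bundle_data.adjustment_reader,
)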
