Building Personal Wealth Advisers with Bank APIs – Hands-On Artificial Intelligence for Banking

Building Personal Wealth Advisers with Bank APIs

In the previous chapter, we analyzed the behavior of the sell side of the exchange. We also learned about sentiment analysis and how to use it to analyze market needs. We then learned a bit about Neo4j, a NoSQL graph database, and used it to build and store a network of entities involved in security trading.

In this chapter, we will focus on consumer banking and understand the need to manage customers' digital data. Then, we will learn how to access the Open Bank Project, an open source platform for open banking. After that, we'll look at an example of wrapping AI models around bank APIs. Finally, we will learn about document layout analysis.

We will cover the following topics in this chapter:

  • Managing customer's digital data
  • The Open Bank Project
  • Performing document layout analysis
  • Cash flow projection using the Open Bank API
  • Using invoice entity recognition to track daily expenses

Let's get started!

Managing customer's digital data

In this era of digitization, there is no reason that money cannot be 100% transparent or that money transfers can't happen in real time, 24/7. Consumers have the right to their data as it represents their identity. Whether or not it is possible today, we should be consolidating our own data – realistically, this should happen over the coming few years. It is best to consolidate our banking data in one place, along with other data such as our frequent flyer mileage. The key point is that there should be two tiers of data architecture: one for consolidation (including storage) and another for running the artificial intelligence services that analyze the data through a smart device, that is, a mobile application. It can be painful to design an AI algorithm without understanding what is going on at the data consolidation layer.

Here, our data source could be identity data, bio/psychometric data, financial data, events that could impact any of this static data, and social data, which represents our relationship with others (including humans, objects, living organisms, and so on). This is, in fact, very similar to a business-to-business (B2B) setting, where any corporation could be represented by its legal identity, shareholders/ownership structures, financial health, events, as well as its commercial relationship, as outlined in Chapter 7, Sensing Market Sentiment for Algorithmic Marketing at Sell-Side. This also means that what we are learning in this chapter can help with your understanding of the previous chapters in this book.

However, for all individuals, including you, financial needs are quite basic: payment, credit, and wealth. These spell out the core activities of financial services. Insurance is included as part of wealth as it aims to protect our wealth against undesirable events and risks, much like the derivatives that hedge risks on procurement costs in Chapter 2, Time Series Analysis.

However, I am also of the opinion that the data that's derived from consumers is also owned by the bank processing the transaction. It's like parenthood—all decisions regarding data (the parent's children) are agreed upon between the data owner (the consumer) and the data producer (the bank). What is lacking today is the technology to quickly attribute the data and economic benefits of the use of this data to certain economic activities, such as marketing. If one organization (for example, a supermarket) is paying social media (for example, Google, Facebook, and Twitter) for consumer's data for marketing purposes, the data owner will be credited with a portion of the economic benefits. Without advances in data technology, mere legal regulations will not be practical.

The Open Bank Project

The world's most advanced policy that allows consumers to consolidate their own data is called Open Banking. It started in the UK in 2016, following the European Union's revised Payment Services Directive (PSD2). This changed the competitive landscape of banks by lowering the entry barrier to using banks' information for financial advice. This makes robo-advisors a feasible business as the financial data that banks hold is no longer segregated.

The challenge with this project is that the dominant incumbent banks have little incentive to open up their data. On the consumer side, the slowness in data consolidation reduces the economic value of this interconnected network of financial data on banking services. This follows Metcalfe's Law, which states that the value of a network is proportional to the square of the number of connected users (in our case, banks). The following table uses game theory to anticipate the outcome for both banks and consumers, assuming that there are only two banks in the market, which gives four possible outcomes:

Cell value = benefits of bank A/bank B/consumer

                          | Bank B: Open Bank API | Bank B: Not Open Bank API
Bank A: Open Bank API     | 0.5/0.5/2             | 0.5/1/1
Bank A: Not Open Bank API | 1/0.5/1               | 1/1/1
For the status quo (that is, without any Open Bank API), let's assume that both banks A and B will enjoy 1 unit of benefits while the consumers will also have 1 unit of benefits.

For any bank to develop an Open Bank API, it will need to spend 0.5 units of its resources. Therefore, we have two cells showing either bank A or bank B developing the Open Bank API while the other does not. The bank developing the Open Bank API will have fewer benefits since 0.5 of the original 1 unit needs to be spent as resources to maintain the API. In these two cases, consumers cannot enjoy any additional benefits as the data is not consolidated.

Only in the case where all banks are adopting the Open Bank API will the consumers see incremental benefits (let's assume there's one more unit so that there's two in total, just to be arbitrary), while both banks have fewer benefits. This, of course, could be wrong as the market as a whole shall be more competitive, which is what is happening in the UK with regard to virtual banking—a new sub-segment has been created because of this initiative! So, at the end of the day, all banks could have improved benefits.
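The payoff reasoning above can be checked with a few lines of Python. This is only a sketch under the assumptions stated in the text (1 unit of baseline benefit, 0.5 units of API cost, and 1 extra unit for consumers when both banks open up); the constant and function names are illustrative and not part of any library.

```python
# Assumptions taken from the text; the names are illustrative only.
BASELINE = 1.0       # status-quo benefit for each bank and for consumers
API_COST = 0.5       # resources a bank spends on an Open Bank API
CONSUMER_GAIN = 1.0  # extra consumer benefit once every bank is open


def payoffs(a_open, b_open):
    """Return (bank A, bank B, consumer) benefits for one cell of the table."""
    bank_a = BASELINE - (API_COST if a_open else 0.0)
    bank_b = BASELINE - (API_COST if b_open else 0.0)
    consumer = BASELINE + (CONSUMER_GAIN if a_open and b_open else 0.0)
    return bank_a, bank_b, consumer


# Build all four outcomes of the two-bank game.
table = {(a, b): payoffs(a, b) for a in (True, False) for b in (True, False)}

# Bank A is better off staying closed whatever bank B does.
a_prefers_closed = all(
    table[(False, b)][0] > table[(True, b)][0] for b in (True, False)
)

for (a_open, b_open), cell in table.items():
    print("A open:", a_open, "| B open:", b_open, "| A/B/consumer:", cell)
print("Staying closed dominates for bank A:", a_prefers_closed)
```

Under these static assumptions, neither bank has an incentive to open up on its own, which is the incentive problem described above; the sketch does not model the extra competition that could lift benefits for all banks.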

Having said that, the reality for most incumbent banks is that they have to maintain two sets of banking services—one completely virtual while the other set of banking channels remains brick and mortar and not scalable. Perhaps the way forward is to build another banking channel outside of its existing one and transfer the clients there.

Since a truly ideal state hasn't been achieved yet, for the moment, to construct a Digital You, we need data on financial transactions from the Open Bank Project (OBP) from the UK, identity verification via Digidentity from the EU, health records stored with IHiS from Singapore, events and social data from Facebook, Twitter, Instagram, and LinkedIn, life events from insurance companies, and so on. In short, we still need to work on each respective system rollout before we unite all these data sources.

Smart devices – using APIs with Flask and MongoDB as storage

Your smart device is a personalized private banker: the software will interact with markets and yourself. Within the smart device, the core modules are the Holding and User Interaction modules. The Holding module will safeguard the investment of the user/customer, while the user's interactions and the user themselves are greeted and connected by the User Interaction module.

The Holding module handles the quantitative aspect of investments—this is exactly what we covered in the previous two chapters, but at a personal level—by managing the portfolio and capturing various market data. However, the difference is that we need to understand the user/customer better through behavioral data that's been captured in the User Interaction module. The Holding module is the cognitive brain of the smart device.

The User Interaction module provides, of course, the interaction aspect of a smart device—it understands the user's preferences on investment and interactions. These investment preferences are captured in the Investment Policy Statement (IPS). These interactions are then handled by the Behavior Analyzer, which analyzes the preferred time, channels, and messages to communicate, as well as the financial behaviors of the user regarding their actual personality and risk appetite, both of which are derived from the data that's obtained from the Data Feeds of external sources or user-generated data from using the device. Last but not least, the communication channels (Comm Channels) interact with the user either by voice, text, or perhaps physically via a physical robot.

This sums up nicely what we mentioned in Chapter 1, The Importance of AI in Banking, as the definition of AI—a machine that thinks and acts like a human, either rationally or emotionally, or both. The Holding module is the rational brain of a human and acts accordingly in the market, while its emotions are handled by the User Interaction module – sympathized by the Behavior Analyzer and how they interact via the Comm Channels. The following diagram shows the market and user interaction through banking functions:

Since we've already talked about the Holding module in the previous two chapters, which focused on the investment process, here, we'll focus more on the User Interaction module. Specifically, we will dig deeper into IPS, which records the investment needs of the user.

Understanding IPS

As we mentioned in Chapter 6, Automated Portfolio Management Using Treynor-Black Model and ResNet, we will start looking at the individual's investment policy statement here. To do this, we need to collect data to build up the IPS for an individual customer.

Here is what it takes to build an IPS for a family:

  • Return and risk objectives:

Return objectives        | To be inputted by the investors and via the behavioral analyzer (personality profiling)
Ability to take risk     | To be inputted by the investors and via the behavioral analyzer (personality profiling)
Willingness to take risk | To be inputted by the investors and via the behavioral analyzer (personality profiling)

  • Constraints:

Liquidity                      | Liquidity of assets can be determined by the prices within the smart device.
Time horizon                   | Plan for your children's future: their studies (where and which school, how much, and so on), housing plans, jobs, retirement, and so on.
Taxes (both US citizens)       | Citizenship via Digidentity.
Legal & regulatory environment | This could be implicated via commercial transactions, citizenship, employment, and residential constraints. You may also need to consider the legal entity that will manage your wealth, such as a family trust.
Unique circumstances           | Interests and special circumstances that aren't made known, including social media or medical profiles that stand out from a standard user's. This needs to be compared across users anonymously to identify real, unique circumstances.
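To make the structure of an IPS concrete, here is a minimal sketch of how the objectives and constraints listed above could be held in code. The class name, field names, and the `missing_sections` helper are hypothetical choices for illustration; they are not taken from the book's code base.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class InvestmentPolicyStatement:
    # Return and risk objectives
    return_objectives: str = ""
    ability_to_take_risk: str = ""
    willingness_to_take_risk: str = ""
    # Constraints
    liquidity: str = ""
    time_horizon: str = ""
    taxes: str = ""
    legal_and_regulatory: str = ""
    unique_circumstances: list = field(default_factory=list)

    def missing_sections(self):
        """List the sections that still have to be collected from the user."""
        return [name for name, value in asdict(self).items() if not value]


# Example values are made up for illustration.
ips = InvestmentPolicyStatement(
    return_objectives="Fund two children's studies",
    ability_to_take_risk="medium",
    time_horizon="15 years",
)
print(ips.missing_sections())
```

A structure like this lets the User Interaction module see at a glance which parts of the IPS still need to be filled in, whether by the investor or by the behavioral analyzer.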

Behavioral Analyzer – expenditure analyzers

Similar to Chapter 2, Time Series Analysis, we are going to forecast day-to-day cash flow. Since income in most cases (certainly for most of the population working on a salary) is fixed on a monthly basis, the only moving parts are the expenditures. Within these expenditures, there could be regular spending costs for things such as groceries, as well as irregular spending costs for things such as buying white goods or even a car. For a machine to track and project regular spending habits, as well as infrequent spending, the practical approach is to record these habits efficiently when they occur.
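As a rough illustration of separating regular from irregular spending, the following sketch treats a spending category as regular when it shows up in most months and projects monthly cash flow from a fixed income. The function names, the 75% threshold, and the sample figures are assumptions made for this example only.

```python
from collections import defaultdict
from statistics import mean


def split_regular_irregular(transactions, min_share=0.75):
    """transactions: iterable of (month as 'YYYY-MM', category, amount spent)."""
    months = {month for month, _, _ in transactions}
    by_category = defaultdict(lambda: defaultdict(float))
    for month, category, amount in transactions:
        by_category[category][month] += amount

    regular, irregular = {}, {}
    for category, monthly in by_category.items():
        if len(monthly) / len(months) >= min_share:
            regular[category] = mean(monthly.values())  # typical monthly spend
        else:
            irregular[category] = sum(monthly.values())  # one-off total
    return regular, irregular


def project_monthly_cash_flow(monthly_income, regular):
    """Fixed income minus the average of the regular spending categories."""
    return monthly_income - sum(regular.values())


# Made-up sample data for illustration.
sample = [
    ("2019-01", "groceries", 400.0), ("2019-02", "groceries", 420.0),
    ("2019-03", "groceries", 380.0), ("2019-01", "rent", 1500.0),
    ("2019-02", "rent", 1500.0), ("2019-03", "rent", 1500.0),
    ("2019-02", "car", 12000.0),
]
regular, irregular = split_regular_irregular(sample)
projected = project_monthly_cash_flow(4000.0, regular)
print(regular, irregular, projected)
```

A real expenditure analyzer would likely use a time series model, as in Chapter 2; this sketch only shows why the two kinds of spending need to be recorded and treated separately.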

Exposing AI services as APIs

While the portfolio optimization model we built in Chapter 6, Automated Portfolio Management Using Treynor-Black Model and ResNet, was great, the key technology that will be addressed in this chapter will demonstrate how AI models are wrapped and provided to users via API. With regard to technical modeling skills, we are not going to add any new techniques to our repertoire in this chapter.

Performing document layout analysis

In ML, there is a discipline called document layout analysis. It is indeed about studying how humans understand documents. It includes computer vision, natural language processing, and knowledge graphs. The end game is to deliver an ontology that can allow any document to be navigated, similar to how word processors can, but in an automated manner. In a word processor, we have to define certain words that are found in headers, as well as within different levels of the hierarchy – for example, heading level 1, heading level 2, body text, paragraph, and so on. What's not defined manually by humans is sentences, vocabulary, words, characters, pixels, and so on. However, when we handle the images taken by a camera or scanner, the lowest level of data is a pixel.

Steps for document layout analysis

In this section, we will learn how to perform document layout analysis. The steps are as follows:

  1. Forming characters from pixels: The technique used to convert pixels into characters is known as Optical Character Recognition (OCR). It is a well-known problem that is addressed by many deep learning examples, including those built on the MNIST dataset of handwritten digits. Alternatively, we could use Tesseract-OCR to perform OCR.
  2. Image rotation: When the image is not located at the correct vertical orientation, it may create challenges for people to read the characters. Of course, new research in this area is occurring that seems to be able to skip this step.
  3. Forming words from characters: Practically, we cannot wait minutes upon minutes for this to happen; with human performance, we can get this right. How do we know that a character shall be banded together with other characters to form one word? We know this from spatial clues. Yet the distance between characters is not fixed, so how can we teach a machine to understand this spatial distance? This is perhaps the challenge most people suffering from dyslexia face. Machines also suffer from dyslexia by default.
  4. Building meaning from words: This requires us to know the topic of the paper and the spelling of the words, which helps us to check our various dictionaries to understand what the document is about. Learning (in terms of deep learning in this book) could simply mean a topic related to education; the reason we know otherwise is that you understand that this book is a machine learning book by Packt, a publisher name that you learned about in the past. Otherwise, by merely reading the word Packt, we may guess that it is related to a packaging company (that is, PACK-t). In addition, we also draw clues from the label words: the heading of step 3 itself looks like a label that introduces the actual content on the right-hand side of it.
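Step 3 can be illustrated with a small sketch that groups characters into words using only the horizontal gaps between their bounding boxes. The input format (character, left edge, right edge) and the `gap_ratio` threshold are assumptions for this example; real OCR output carries more information and a production system would need something more robust.

```python
from statistics import median


def group_characters(chars, gap_ratio=0.5):
    """Group (char, x_left, x_right) boxes on one text line into words.

    A new word starts when the gap to the previous character is larger than
    gap_ratio times the typical (median) character width on the line.
    """
    if not chars:
        return []
    chars = sorted(chars, key=lambda c: c[1])  # left-to-right order
    typical_width = median(right - left for _, left, right in chars)

    words, current = [], chars[0][0]
    for previous, char in zip(chars, chars[1:]):
        gap = char[1] - previous[2]
        if gap > gap_ratio * typical_width:
            words.append(current)  # large gap: close the current word
            current = char[0]
        else:
            current += char[0]     # small gap: same word
    words.append(current)
    return words


# Made-up boxes: three characters, a wide gap, then two more characters.
line = [("N", 0, 10), ("E", 11, 21), ("T", 22, 32), ("3", 45, 55), ("0", 56, 66)]
words = group_characters(line)
print(words)
```

Scaling the threshold by the median character width is one simple way to cope with the fact that the distance between characters is not fixed across fonts and scans.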

Classifying words as various generic types of entities helps – for example, dates, serial numbers, dollar amount, time, and so on. These are the generic entities we typically see in open source spaces such as spaCy, which we used in Chapter 7, Sensing Market Sentiment for Algorithmic Marketing at Sell Side.
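As a lightweight illustration of tagging generic entity types, the sketch below uses regular expressions from Python's built-in re module rather than a trained model such as spaCy's. The patterns and labels are simplified assumptions and will miss many real-world formats.

```python
import re

# Ordered list: the first matching pattern wins. Patterns are illustrative only.
PATTERNS = [
    ("date", re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$|^\d{4}-\d{2}-\d{2}$")),
    ("time", re.compile(r"^\d{1,2}:\d{2}(:\d{2})?$")),
    ("amount", re.compile(r"^[$£€]\s?\d[\d,]*(\.\d{2})?$")),
    ("serial_number", re.compile(r"^(?=.*\d)[A-Z0-9-]{6,}$")),
]


def tag_token(token):
    """Return a generic entity label for one token, or 'word' if none match."""
    for label, pattern in PATTERNS:
        if pattern.match(token):
            return label
    return "word"


tokens = ["2019-03-31", "14:05", "$1,250.00", "INV-004512", "Total"]
tags = [tag_token(token) for token in tokens]
print(list(zip(tokens, tags)))
```

Tags like these can then be fed to a downstream model as features alongside the word vectors.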

With regard to spatial clues in the form of words, we may understand the importance of larger words while paying less attention to smaller words. The location of the words on the page matters too. For example, we normally read English from left to right, top to bottom, while some other languages, such as ancient Chinese, are read from top to bottom, right to left.

Using Gensim for topic modeling

In our example of topic modeling, we will focus on step 4 to limit our scope of work. We will take the prework from steps 1 to 3 for granted and skip steps 5 and 6. The dataset images we will be using have already been cleaned, rotated, and OCRed, which includes binding characters to form words. What we have at hand is a dataset with each record represented by a text block, which could include multiple words. Gensim is an open source library for topic modeling and word embeddings.

Vector dimensions of Word2vec

Word2vec represents each word by a vector of features, where the feature values are learned from the words that appear near it in the same sentence. It is meant to quantify the similarity between concepts and topics. In our example of Word2vec, we will use a Word2vec model to convert words into vectors. However, each text block may contain several words, and therefore several vectors. In such a case, the series of vectors is compressed into a single value, using an eigenvalue. We will use this simple approach to perform dimension reduction, which we do when we want to reduce the number of features (dimensions) of a variable. The most common approach to dimension reduction is called Principal Component Analysis (PCA). It is mostly applied to datasets where each feature is a scalar, not a vector. Imagine that each word is represented by a vector. Here, one text block with two words will be represented by a matrix composed of two vectors. Therefore, PCA may not be an ideal solution for this kind of dimension reduction task.
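One possible way to compress a block of word vectors into a single value with an eigenvalue is sketched below. This is an assumption about how such a reduction could be done, not necessarily the chapter's exact method: the word vectors are stacked into a matrix, the symmetric Gram matrix of that block is formed, and its largest eigenvalue is returned. The function name and the toy vectors are illustrative.

```python
import numpy as np


def block_to_scalar(word_vectors):
    """Compress the word vectors of one text block into a single number.

    word_vectors: sequence of equal-length vectors, one per word.
    Returns the largest eigenvalue of the Gram matrix M @ M.T.
    """
    matrix = np.atleast_2d(np.asarray(word_vectors, dtype=float))  # (n_words, dim)
    gram = matrix @ matrix.T                                        # (n_words, n_words), symmetric
    eigenvalues = np.linalg.eigvalsh(gram)                          # ascending order
    return float(eigenvalues[-1])


# Toy 2-dimensional "word vectors" for illustration.
two_words = block_to_scalar([[1.0, 0.0], [0.0, 2.0]])
one_word = block_to_scalar([[3.0, 4.0]])
print(two_words, one_word)
```

For a single word, this value is simply the squared length of its vector; for several words it reflects the dominant direction they share, at the cost of discarding most of the information in the individual dimensions.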

While interpreting the vectors that represent the topic of the word, it is important to analyze the dimensions involved, as each dimension represents one semantic/meaning group. In our example of Word2vec, we'll skip this step to avoid putting too many dimensions into the meaningful extraction process. This means we'll have smaller feature spaces for illustration purposes.

Cash flow projection using the Open Bank API

In the future, we will need robo-advisors to be able to understand our needs. The most basic step is to be able to pull our financial data from across banks. Here, we will assume that we are customers of consumer banking services from the US who are staying in the UK. We are looking for wealth planning for a family of four—a married couple and two kids. What we want is a robo-advisor to perform all our financial activities for us.

We will retrieve all the necessary transaction data from the Open Bank Project (OBP) API to forecast our expenditure. The data that we will be using is simulated data that follows the format specified in the OBP. We are not going to dive deep into any of the software technologies; instead, we will focus on building the wealth planning engine. The family description we'll be using has been obtained from the Federal Reserve's data regarding American household financials.

The following table shows the typical values of households in the US, which helps us understand the general demand for consumer banking:


                                 | Value (In US $) | Data Source (OBS)
Salaries from working people     |                 | Largest auto-payment every month, with fixed salaries on a monthly basis.
Living expenses                  |                 |
Annual expenses                  |                 | Retrieve all transactions from credit cards, savings, and current accounts.
Debt repayment                   |                 | Transactions related to debt account.
Net worth                        |                 |
Financial assets                 |                 | The outstanding balance of Securities account. No visibility of the 401(k) plan.
Non-financial assets             |                 | Housing valuation provided by Zillow.
                                 |                 | The outstanding balance of the debt account.
Auto loans and educational debts |                 | Auto loan: outstanding balance of debt account; student loan (federal): the counterpart is Federal Student Loans; student loan (private): outstanding balance of debt account.

For more details about Zillow, please refer to this link:

Steps involved

To use the Open Bank API, we will need to do the following:

  1. Register to use the Open Bank API.
  2. Download the necessary data.
  3. Create a database to store this data.
  4. Set up an API for forecasting.

Let's get started!

Registering to use Open Bank API

There are several ways we can access the Open Banking Project—we will work on one such where we registered at

Creating and downloading demo data

The code for this section can be downloaded from GitHub. Based on the file from this repository, we have modified the program so that it downloads the required data. Use the following code snippet to download the demo data:

# -*- coding: utf-8 -*-

from __future__ import print_function # (at top of module)
import json
import sys
import time
import requests

# Note: in order to use this example, you need to have at least one account
# that you can send money from (i.e. be the owner).
# All properties are now kept in one central place

from props.default import *

# You probably don't need to change those

# The rest of the original script, which retrieves `transactions`, is omitted here.
# Add the following lines to the script before running it
# so that the downloaded transactions are exported to a file
print(" --- export json")
with open('transactions.json', 'w') as f_json:
    json.dump(transactions, f_json, sort_keys=True, indent=4)

Creating a NoSQL database to store the data locally

I prefer MongoDB for this due to its ability to import JSON files in a hierarchical manner, without us needing to define the structure in advance. Even though we will need to store the NoSQL file in SQL database format (as we did in the previous chapter) whenever we need to run predictions with the ML model, it is still useful for us to cache the downloaded data physically before we run the prediction.

So, you may be wondering why we need to store it in a NoSQL database for our purposes – can't we just save it as we did in the previous chapter, when we handled tweet data? No – we want to use a database for quicker retrieval, given that we could be storing hundreds of thousands of JSON files accumulated over an open-ended number of days, rather than a one-off batch download. This also depends on how frequently we want to download the data; if we wish to update our databases every day, we may not need to store the JSON data in a NoSQL database as we wouldn't have very many files to deal with. However, if we are querying the data or continuously adding new features to the training dataset, we might be better off storing the raw data on our side.

The following code is used to establish our connectivity with the MongoDB server:

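from pymongo import MongoClient
import json
import pprint

#client = MongoClient()
client = MongoClient('mongodb://localhost:27017/')
db_name = 'AIFinance8A'
collection_name = 'transactions_obp'

# json.load reads from a file object (json.loads expects a string)
with open('transactions.json', 'r') as f_json:
    json_data = json.load(f_json)

# insert the documents (assumes the file holds a list of transaction documents)
collection = client[db_name][collection_name]
collection.insert_many(json_data)

#to check if all documents are inserted
print(collection.count_documents({}))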

The following code is used to create the database:

#define libraries and variables
import sqlite3
from pymongo import MongoClient
import json
from flatten_dict import flatten

client = MongoClient('mongodb://localhost:27017/')
db_name = 'AIFinance8A'
collection_name = 'transactions_obp'

db = client[db_name]
collection = db[collection_name]
posts = db.posts


#flatten the dictionary

#create the database schema
#db file
db_path = 'parsed_obp.db'
db_name = 'obp_db'

#sql db
sqlstr = 'drop table '+db_name
#loop through the dict and insert them into the db

for cnt in dict_cnt:
    for fld in tuple_fields_list:
        sqlstr = 'insert into '+ db_name+ '(' + str(fld_list_str)+') VALUES \
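Since the listing above is abbreviated, here is a self-contained sketch of the same idea: flatten nested JSON-style records and load them into a SQLite table. It uses a small hand-written flatten helper instead of the flatten_dict package, an in-memory database, and parameterized INSERT statements rather than string concatenation. The function names and the sample records are assumptions for illustration, not the book's code.

```python
import sqlite3


def flatten_record(record, parent_key="", sep="_"):
    """Flatten nested dictionaries: {'a': {'b': 1}} becomes {'a_b': 1}."""
    flat = {}
    for key, value in record.items():
        new_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


def store_records(conn, table, records):
    """Create `table` from the flattened keys and insert every record."""
    flat_records = [flatten_record(record) for record in records]
    columns = sorted({column for record in flat_records for column in record})
    quoted = ", ".join('"{}"'.format(column) for column in columns)
    conn.execute('DROP TABLE IF EXISTS "{}"'.format(table))
    conn.execute('CREATE TABLE "{}" ({})'.format(
        table, ", ".join('"{}" TEXT'.format(column) for column in columns)))
    placeholders = ", ".join("?" for _ in columns)
    rows = [
        tuple(None if record.get(c) is None else str(record[c]) for c in columns)
        for record in flat_records
    ]
    conn.executemany(
        'INSERT INTO "{}" ({}) VALUES ({})'.format(table, quoted, placeholders), rows)
    conn.commit()
    return columns


# Made-up records shaped loosely like nested transaction documents.
sample = [
    {"id": "t1", "details": {"completed": "2019-01-05",
                             "value": {"amount": "-12.50", "currency": "GBP"}}},
    {"id": "t2", "details": {"completed": "2019-01-31",
                             "value": {"amount": "2500.00", "currency": "GBP"}}},
]
conn = sqlite3.connect(":memory:")
columns = store_records(conn, "obp_db", sample)
row_count = conn.execute('SELECT COUNT(*) FROM "obp_db"').fetchone()[0]
print(columns, row_count)
```

Using `?` placeholders lets SQLite handle quoting of the values, which avoids the escaping problems that come with building INSERT statements by hand.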

Setting up the API for forecasting

To perform forecasting for payments, we need to know what kind of forecasting model we want to build. Do we want a time series model or the ML model? Of course, we want to have a model that provides more information.

In our example, we have not prepared any model for this as the method we'll be using will be similar to the model we used in Chapter 2, Time Series Analysis. The main point to illustrate here is how to set up the API server and how to use another program to consume the API. Please make sure these two programs are run simultaneously.

The server will be set up to listen to requests so that it can run predictions. In this example, we will simply load the model without running any predictions. The following code snippet is used to set up the API server:

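from flask import Flask, request, jsonify
import joblib # sklearn.externals.joblib has been removed from recent scikit-learn releases
import traceback
import pandas as pd
import numpy as np

# Your API definition
app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    # No prediction is run in this example; the request payload is echoed back
    try:
        return jsonify({'received': request.get_json()})
    except Exception:
        return jsonify({'trace': traceback.format_exc()})

#Run the server
if __name__ == '__main__':
    # model = joblib.load('model.pkl') # hypothetical file name; enable once a model is saved
    app.run(port=12345, debug=True) # port number chosen for illustration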

The following code snippet is used to create requests from the client application:

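import requests

# base URL of the API server, ending with '/'
host = ''

# send a POST request with a JSON payload to the /predict endpoint
r = requests.post(host + 'predict', json={"key": "value"})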
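To see the whole round trip in one file, the following sketch runs a tiny prediction server in a background thread and calls it from a client. To stay dependency-free, it swaps Flask and requests for Python's standard library (http.server and urllib), and it uses a simple average as a stand-in for a real forecasting model. The handler, the function names, and the payload format are all assumptions for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def forecast_next(expenses):
    """Stand-in model: forecast the next value as the mean of the history."""
    return sum(expenses) / len(expenses)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": forecast_next(payload["expenses"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def call_predict(base_url, expenses):
    """Client side: POST the expense history and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url + "/predict",
        data=json.dumps({"expenses": expenses}).encode(),
        headers={"Content-Type": "application/json"},
    )
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # bypass proxies
    with opener.open(request) as response:
        return json.loads(response.read())


server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
result = call_predict("http://127.0.0.1:{}".format(server.server_port), [120.0, 80.0, 100.0])
server.shutdown()
server.server_close()
print(result)
```

Running the server in a thread is only a convenience for the demonstration; in the chapter's setup, the server and the client are two separate programs running at the same time.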

Congratulations! You have built a robot that can read data from banks and have built it so that it can run AI models on this data.

For a household, it becomes critical to limit expenses to increase their cash flow. In the next section, we will look at how to track daily expenses using the invoice entity recognition technique.

Using invoice entity recognition to track daily expenses

While we always dream of the end game of digitization through AI in finance, the reality is that some data is still trapped. Very often, expenses come in the form of paper, not API feeds. Dealing with paper is inevitable if we are to transform ourselves into a fully digital world where all our information is stored in JSON files or SQL databases. We cannot avoid handling existing paper-based information. Using an example of a paper-based document dataset, we are going to demonstrate how to build up the engine for the invoice entity extraction model.

In this example, we will assume you are developing your own engine to scan and transform invoices into a structured data format. However, due to a lack of invoice data, you will need to parse the Patent images dataset, which is available online, instead. Within the dataset, there are images, text blocks, and the target results that we want to extract. This is known as entity extraction. The challenge here is that these invoices are not in a standardized format. Different merchants issue invoices in different sizes and formats, yet we are still able to understand the visual clues (font size, lines, positions, and so on), the language of the words, and the words surrounding them (called labels).

Steps involved

We have to follow six steps to track daily expenses using invoice entity recognition. These steps are as follows:

  1. Import the relevant libraries and define the variables. In this example, we're introducing topic modeling, including Word to Vector (Word2vec), using gensim, and regular expressions using re, a built-in module. The following code snippet is used to import the required libraries:
import os
import pandas as pd
from numpy import genfromtxt
import numpy as np
from gensim.models import Word2Vec
from gensim.models.keyedvectors import KeyedVectors
import gensim.downloader as api
from gensim.parsing.preprocessing import preprocess_string, strip_tags, remove_stopwords, strip_numeric, strip_multiple_whitespaces
from scipy import linalg as LA
import pickle
import re
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report,roc_curve, auc,confusion_matrix,f1_score

#please run this in terminal: sudo apt-get install libopenblas-dev
corpus = api.load("text8") # text8 is a text corpus in gensim-data
model_word2vec = Word2Vec(corpus).wv # train word vectors on the corpus
  2. Define the functions that will need to be used later. There will be two groups of functions: one, 2A, is used to train and test the neural network, while the other, 2B, aims at converting the text into numeric values. The following code snippet defines the functions that will be used for invoice entity recognition:
#2. Define functions relevant for works
##2A Neural Network
##2A_i. Grid search that simulate the performance of different neural network design
def grid_search(X_train,X_test, Y_train,Y_test,num_training_sample):
##2A_ii train network
def train_NN(X,Y,target_names):
#2B: prepare the text data series into numeric data series
#2B.i: cleanse text by removing multiple whitespaces and converting to lower cases
def cleanse_text(sentence,re_sub):
#2B.ii: convert text to numeric numbers
def text_series_to_np(txt_series,model,re_sub):

  3. Prepare the dataset. In this example, we will try to use numpy to store the features as they're quite big. We will also use pandas for each file as it is far easier to manipulate and select columns using a DataFrame, given that the size of each image isn't too large. The following code snippet is used to prepare the dataset:
#3. Loop through the files to prepare the dataset for training and testing
#loop through folders (represent different sources)
for folder in list_of_dir:
    files = os.path.join(path,folder)
    #loop through folders (represent different filing of the same
    for file in os.listdir(files):
        if file.endswith(truth_file_ext):
            #define the file names to be read

            #merge ground truth (aka target variables) with the blocks

            #convert the text itself into vectors and lastly a single value using Eigenvalue
            text_df = f_df['text']
            text_np = text_series_to_np(text_df,model_word2vec,re_sub)

            label_df = f_df['text_label']
            label_np = text_series_to_np(label_df, model_word2vec, re_sub)
Y_pd = pd.get_dummies(targets_df)
Y_np = Y_pd.values
  4. Execute the model. Here, we execute the model we prepared using the functions we defined in previous steps. The following code snippet is used to execute the model:
#4. Execute the training and test the outcome
NN_clf, f1_clf = train_NN(full_X_np,Y_np,dummy_header)

Congratulations! With that, you have built a model that can extract information from scanned images!

  5. Draw the clues from the spatial and visual environment of the word. The preceding line clearly separates steps 4 and 5. Noticing how these lines are being projected also helps us group similar words together. For documents that require original copies, we may need to look at signatures and logos, as well as matching these against a true verified signature or stamp.
  6. Construct a knowledge map of these documents. This is when we can build a thorough understanding of the knowledge embedded in the document. Here, we need to use the graph database to keep track of this knowledge (we covered this in the previous chapter).

This concludes our example of tracking daily expenses, as well as this chapter.

Summary


In this chapter, we covered how to extract data and provide AI services using APIs. We understood how important it is to manage customer's digital data. We also understood the Open Bank Project and document layout analysis. We learned about this through two examples—one was about projecting cash flows, while the other was about tracking daily expenses.

The next chapter will also focus on consumer banking. We will learn how to create proxy data for information that's missing in the customer's profile. We also will take a look at an example chatbot that we can use to serve and interact with customers. We will use graph and NLP techniques to create this chatbot.