
Mass Customization of Client Lifetime Wealth

In the previous chapter, we learned how to manage the digital data of customers. We also covered the Open Bank Project and the Open Bank API. In addition, we learned about document layout analysis and looked at an example of projecting the cash flow for a typical household. Then, we looked at another example of how to track daily expenses using invoice entity recognition.

In this chapter, we will learn how to combine survey data with the bank's own data to analyze customers at a personal level. We will learn how to predict customer responses using ensemble learning, and we will build a chatbot to serve customers 24/7. We will also learn how to manage knowledge using NLP and Neo4j, a graph database, with the help of an example. After this, we will learn how to use the Cypher language to manipulate data in the Neo4j database.

The following topics will be covered in this chapter:

  • Financial concepts of wealth instruments
  • Ensemble learning
  • Predicting customer responses
  • Building a chatbot to serve customers 24/7
  • Knowledge management using NLP and graphs

Financial concepts of wealth instruments

In this section, we will answer a few questions commonly asked by a consumer bank's marketers. Then, we will look at another important model development technique, ensemble learning, which is useful for combining predictions from different models.

Sources of wealth: asset, income, and gifted

One of the most common tasks in retail banking customer analytics is to retrieve additional data that helps explain customers' investment behaviors and patterns. We usually know how customers responded; the job of a model is to find out why they responded as they did. Surprisingly, a lot of aggregated information about individual behavior is available, such as census data. We can also obtain data from social media when users authenticate with their social media accounts. The relevant social media information can then be joined with the individual-level transactional data that we observe internally in the organization. To explain individual banking behaviors, the most relevant supplementary data is information about customers' wealth.

Customer life cycle

A typical life cycle involves three major phases—acquisition, cross-selling/upselling, and retention. The following diagram illustrates these three phases:

Acquisition is when we start a commercial relationship with a customer. Then, we move on to cross-selling and up-selling. Cross-selling increases the number of products/services sold to the customer, while up-selling deepens the wallet share of the products/services the customer already owns. Retention is about keeping the relationship and is a defensive act by the bank to protect it. Our first example (described in the following section) concerns cross-selling (if the customer does not have the product) and up-selling (if the customer already owns the product).

Ensemble learning

Ensemble learning is a technique that helps us improve the accuracy of predictions by combining several models. Later in this chapter, we will also learn how to use a graph database for knowledge storage, which is a current challenge in knowledge representation and can be used to empower AI for professional-grade financial services.

Ensemble learning is an approach that combines several models in order to give a more stable prediction. It was a very common approach before deep neural networks became popular. For completeness, we do not want to ignore this modeling technique in this very short book. In particular, we use a random forest, which means that we build many decision trees on random subsets of the data and features and combine their votes into a single prediction. Another approach is to combine weak models sequentially to generate a strong result, which is called boosting. We won't cover boosting here, but readers are encouraged to dig deeper into the scikit-learn documentation (https://scikit-learn.org/stable/).
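
To make the distinction concrete, the following is a minimal sketch (not taken from this chapter's own code) that fits a bagging-style random forest and a boosting-style gradient boosting model on the same synthetic data; the dataset and model parameters are illustrative only:

# Minimal sketch: a bagging-style ensemble (random forest) versus a
# boosting-style ensemble on the same synthetic data; parameters are
# illustrative only
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=10,
                           n_informative=4, random_state=0)

# Random forest: many trees trained on bootstrap samples, votes averaged
rf = RandomForestClassifier(n_estimators=100, random_state=0)
# Boosting: shallow trees added one by one, each correcting the previous ones
gb = GradientBoostingClassifier(n_estimators=100, max_depth=2, random_state=0)

for label, model in [('random forest', rf), ('gradient boosting', gb)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(label, round(scores.mean(), 3))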

Knowledge retrieval via graph databases

To make a machine talk like a human in customer service, one of the key elements is the conversational component. When we engage in conversation, human customers may not provide all of the information required for processing. Humans can work with this fuzziness: they understand context, and so can infer meaning even when concepts are not explicitly mentioned. Since a machine can only solve well-defined problems, it is the machine's job to infer the missing meaning from the knowledge map it keeps about its customers. A graph database is used to serve this purpose.

Predicting customer responses

So far, we have not talked about the day-to-day marketing activity of the bank. Now, we have finally come to look at how marketing prospects are determined. Even though each customer is unique, they are still handled by algorithms in the same way.

In this example, you will assume the role of a data scientist tasked with marketing a term deposit product. We are going to train a model to predict customer responses to the term deposit marketing campaign. The bank's internal data about customers and their previous responses to the campaign is obtained from the Center for Machine Learning and Intelligent Systems (https://archive.ics.uci.edu/ml/datasets/bank+marketing) at the Bren School of Information and Computer Sciences, University of California, Irvine. Survey information about personal wealth is obtained from the US Census Bureau (https://www.census.gov/data/tables/time-series/demo/income-poverty/historical-income-households.html), which serves as an augmentation of the bank's internal data.
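
The walkthrough below assumes that the bank's data has already been read into a DataFrame called df_bank. As a minimal sketch (the file name and the semicolon separator are assumptions about the downloaded UCI file; adjust them to match your local copy), the data could be loaded as follows:

#minimal sketch: load the UCI bank marketing file into df_bank
#the file name and separator are assumptions about the downloaded dataset
import pandas as pd

df_bank = pd.read_csv('bank-full.csv', sep=';')
print(df_bank.shape)
print(df_bank[['age', 'balance', 'y']].head())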

Solution

There are four steps to complete this example:

  1. We introduce random forest, which is a type of machine learning algorithm that utilizes ensemble learning, allowing predictions to be made by multiple models. The resulting model is a combination of the results from the multiple models. The following is the code snippet to import the required libraries and define the variables:
#import libraries & define variables
import pandas as pd
import os
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
  2. Census data provides information about wealth by age group, which we will use to enrich the bank's customer data. The following is the code snippet to handle the census data (a self-contained sketch of the binning-and-mapping pattern used here appears after these steps):
cat_val = ''
cat_dict = {}
for index, row in df_xtics.iterrows():
...

df_bank['age_c'] = pd.cut(df_bank['age'], [0,35,45,55,65,70,75,200])

#ID Conversions
df_bank['age_c_codes']=df_bank['age_c'].cat.codes.astype(str)
age_map={'0':'Less than 35 years'
,'1':'35 to 44 years'
,'2':'45 to 54 years'
,'3':'55 to 64 years'
,'4':'.65 to 69 years'
,'5':'.70 to 74 years'
,'6':'.75 and over'}
  3. We want to illustrate how to map one column's data, using age to bring in the wealth data. The following is the code snippet to combine the census data with the bank's data:
#3. map back the survey data
df_bank['age_c1']=df_bank['age_c_codes'].map(age_map)
df_bank['age_c1_val']=df_bank['age_c1'].map(cat_dict['Age of Householder'])

X_flds = ['balance','day', 'duration', 'pdays',
'previous', 'age_c1_val']
X = df_bank[X_flds]
y = df_bank['y']
  4. The following is the code snippet to train the model:
#train a random forest on the merged bank and census features
clf = RandomForestClassifier(n_estimators=100, max_depth=2,
                             random_state=0)
clf.fit(X, y)
print(clf.feature_importances_)
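
Because the loop that builds cat_dict is elided in step 2, the following self-contained sketch (with invented wealth figures) illustrates the binning-and-mapping pattern used in steps 2 and 3:

#self-contained sketch of the binning-and-mapping pattern from steps 2-3
#the wealth figures in toy_wealth are invented for illustration only
import pandas as pd

toy = pd.DataFrame({'age': [23, 41, 52, 68, 80]})
toy['age_c'] = pd.cut(toy['age'], [0, 35, 45, 55, 65, 70, 75, 200])
toy['age_c_codes'] = toy['age_c'].cat.codes.astype(str)

age_map = {'0': 'Less than 35 years', '1': '35 to 44 years',
           '2': '45 to 54 years', '3': '55 to 64 years',
           '4': '.65 to 69 years', '5': '.70 to 74 years',
           '6': '.75 and over'}
toy_wealth = {'Less than 35 years': 1.0, '35 to 44 years': 2.5,
              '45 to 54 years': 4.0, '55 to 64 years': 5.5,
              '.65 to 69 years': 6.0, '.70 to 74 years': 6.2,
              '.75 and over': 6.5}

toy['age_c1'] = toy['age_c_codes'].map(age_map)
toy['age_c1_val'] = toy['age_c1'].map(toy_wealth)  #plays the role of cat_dict[...]
print(toy[['age', 'age_c1', 'age_c1_val']])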

Congratulations! You have merged an external dataset with the internal dataset to augment our understanding of the customers.

Building a chatbot to service customers 24/7

When we interact with a robot, we expect it to understand and speak to us. The beauty of having a robot work for us is that it can serve us 24 hours a day, every day of the week. Realistically, today's chatbots interact poorly with customers, so we should break down their components and try to raise the bar to a higher standard. For application-style development, you could use Google Assistant, Amazon's Alexa, or IBM Watson. But for learning purposes, let's break down the components and focus on the key challenges:

At a high level, the chatbot performs two operations. One is to convert voice input into text, and the other is to convert text output back into voice. Both of these operations involve extracting entities and understanding intent; in this example, the resulting text is an entity, whereas the meaning of the text is an intent. The dialogue represents a conversation between the service requester and the service provider. When faced with an incoming service request, the chatbot converts the voice instruction into text and adds context to the information received. Once the context building is done, the chatbot processes the information and generates the output in text format, which it then has to convert into an audible voice output for the service requester. The whole scenario is explained in the preceding diagram.

Right now, let's focus on chat only, without worrying about voice recognition and utterance; that is, let's ignore voice-to-text and text-to-voice conversion. In my opinion, since this task is machine- and memory-intensive, and the relevant data is available in so many places, it is not a task for a start-up to tackle; instead, we should leave it to a mainstream cloud provider with a strong infrastructure to deliver the service.

For text-only chat, the key focus should be on intent classification and entity extraction. While we have touched on entity extraction in the previous chapter, the input still needs to be classified before it is extracted. Intent classification works similarly to entity extraction, but treats the whole sentence as an entity for classification.
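
As a minimal sketch of this idea (the second intent and the similarity threshold are purely illustrative, and en_core_web_md is the spaCy model used later in this chapter), a whole sentence can be scored against each reserved intent word using spaCy's vector similarity:

#minimal sketch: classify the intent of a whole sentence by similarity
#the 'transfer' intent and the 0.5 threshold are illustrative assumptions
import spacy

nlp = spacy.load('en_core_web_md')
intent_examples = ['check', 'transfer']

def classify_intent(sentence, threshold=0.5):
    doc = nlp(sentence)
    #score each reserved intent word against the whole sentence
    scores = {intent: nlp(intent).similarity(doc) for intent in intent_examples}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else ''

print(classify_intent('I would like to check my deposit balance'))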

While it is very common to run a chatbot using ChatterBot or Rasa NLU, here we break down the components and build them from the bottom up.

Let's say that we are a simple bank that offers deposits and loans. We are building a simple chatbot that can serve existing customers only, and at the moment, we only have two customers, one called abc with a deposit account, and another called bcd with a loan account:

Abc's deposit has an outstanding balance of 100 units and a pricing of 1, and bcd has an outstanding loan of 100 units and a pricing of 2.

Knowledge management using NLP and graphs

Essentially, there are two ways for us to retrieve and update knowledge about the real world. One is to store the knowledge in vector space and read the file into memory at runtime, using models such as Word2Vec and BERT. The other is to load the knowledge into a graph database, such as Neo4j, and retrieve and query the data from there. The trade-off between the two approaches lies in speed versus transparency. For high-speed subject classification, in-memory models fare better, but for tasks such as banking decisions, updating the data requires full transparency and permanent record keeping. In these cases, we will use a graph database. However, as in the example we briefly covered in Chapter 7, Sensing Market Sentiment for Algorithmic Marketing at Sell Side, NLP is required to extract information from documents before we can store that information in graph format.

Practical implementation

The following are the steps to complete this example:

  1. Copy the CSV files into Neo4j's import folder so that they can be loaded with the Cypher language later. We assume that the CSV files are dumped from the traditional SQL database. The following are the commands to be executed from the command line:
sudo cp dataset.csv /var/lib/neo4j/import/edge.csv
sudo cp product.csv /var/lib/neo4j/import/product.csv
sudo cp customer.csv /var/lib/neo4j/import/customer.csv
  2. Open the browser and navigate to http://localhost:7474/browser/. Then, set the username and password that will be used to connect to the database. This will be executed only once:
username: test, password: test
  3. Delete all nodes:
MATCH (n) DETACH DELETE n;
  4. Create customer data:
LOAD CSV WITH HEADERS FROM "file:///customer.csv" AS row
CREATE (c:Customer {customer_id: row.customer});
  5. Create product data:
LOAD CSV WITH HEADERS FROM "file:///product.csv" AS row
CREATE (p:Product {product_name: row.product});
  6. Load the CSV file that connects customers to products:
LOAD CSV WITH HEADERS FROM "file:///edge.csv" AS line
WITH line
MATCH (c:Customer {customer_id:line.customer})
MATCH (p:Product {product_name:line.product})
MERGE (c)-[:HAS {TYPE:line.type, VALUE:toInteger(line.value)}]->(p)
RETURN count(*);
  7. Match and return the data:
MATCH (c)-[cp]->(p) RETURN c,cp,p;

Cypher is a language in its own right; what we essentially do here is create the products and customers, and then load another file that connects customers to products.
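
The LOAD CSV statements above expect particular column headers. The exact files are not shown here, so the following is a hypothetical sketch of what the three files might contain, with the column names inferred from the Cypher queries and the values taken from the abc/bcd toy data described earlier:

#hypothetical sketch of the three CSV files expected by the Cypher queries
#column names are inferred from row.customer, row.product, line.type, line.value
import pandas as pd

pd.DataFrame({'customer': ['abc', 'bcd']}).to_csv('customer.csv', index=False)
pd.DataFrame({'product': ['deposit', 'loan']}).to_csv('product.csv', index=False)
pd.DataFrame({
    'customer': ['abc', 'abc', 'bcd', 'bcd'],
    'product':  ['deposit', 'deposit', 'loan', 'loan'],
    'type':     ['balance', 'pricing', 'balance', 'pricing'],
    'value':    [100, 1, 100, 2],
}).to_csv('dataset.csv', index=False)   #copied to edge.csv in step 1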

  8. We will connect to the Neo4j database that we have just populated with data, using the default connection parameters. Please note the unique syntax of Cypher. In addition, the NLP model is loaded to be used for similarity analysis of the input instruction. The Cypher queries are stored in a dictionary; after the intent is read, the corresponding query string is retrieved. Then, we build the knowledge using the graph database:
#import libraries and define parameters
from neo4j import GraphDatabase
import spacy

#define the parameters, host, query and keywords
uri = "bolt://localhost:7687"
driver = GraphDatabase.driver(uri, auth=("test", "test"))
session = driver.session()

check_q = ("MATCH (c:Customer)-[r:HAS]->(p:Product) "
           "WHERE c.customer_id = $customerid "
           "AND p.product_name = $productname "
           "RETURN DISTINCT properties(r)")
...
intent_dict = {'check': check_q, 'login': check_c}

#list of key intents, products and attributes
product_list = ['deposit', 'loan']
attribute_list = ['pricing', 'balance']
intent_list = ['check']
print('loading nlp model')
nlp = spacy.load('en_core_web_md')
  9. Users should be authenticated and identified properly using the SQL database. For ease of illustration, we use the graph database here, but this is clearly not the right place for authentication: usernames and passwords should be stored in a dedicated table whose access rights can be restricted to far fewer individuals than the number of people on the database. The following is the code snippet to authenticate the user (a hypothetical sketch of the run_query helper it relies on follows these steps):
if name == '' or reset:
    name = input('Hello, What is your name? ')
    print('Hi '+name)
    #check for login
    query_str = intent_dict['login']
    result = session.read_transaction(run_query, query_str, name, \
        product, attribute, attribute_val)
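
The run_query helper passed to read_transaction is not shown in this excerpt. A hypothetical sketch, consistent with the query strings defined above and with how the results are consumed later, might look like this:

#hypothetical sketch of the run_query helper; not the book's exact code
def run_query(tx, query_str, customerid, productname, attribute, attribute_val):
    #only the customer and product are bound as Cypher parameters here;
    #the remaining arguments are carried along for queries that need them
    records = tx.run(query_str, customerid=customerid, productname=productname)
    #for the check query, each record holds properties(r),
    #a dict such as {'TYPE': 'pricing', 'VALUE': 1}
    return [record[0] for record in records]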

Sentence intent and entity extraction utilizes spaCy's similarity analysis. Based on a pretrained word-to-vector model, the reserved words for intents and entities are compared with the input sentence to extract the relevant intent and entities. The model is overly simplified, and readers are left with a lot of creative space to enhance the extraction work, for example by using a better language model, such as BERT, assuming we have trained it to perform the relevant classification task.

The following is the code snippet to extract entities and add intent:

#Sentences Intent and Entities Extraction
input_sentence = input('What do you like to do? ')
if input_sentence == "reset":
    reset = True
entities = intent_entity_attribute_extraction(nlp, input_sentence, \
    tokens_intent, tokens_products, tokens_attribute)
#actually can build another intent classifier here based on the scores and words matched as features, as well as previous entities
intent = entities[0]
product = entities[1]
attribute = entities[2]
attribute_val = entities[3]
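
The intent_entity_attribute_extraction function itself is not shown in this excerpt. The following is a hypothetical sketch of how it might work, assuming tokens_intent, tokens_products, and tokens_attribute are spaCy tokens built from the reserved word lists, and using an illustrative similarity threshold:

#hypothetical sketch of intent_entity_attribute_extraction; the similarity
#threshold of 0.7 is an illustrative assumption, not the book's value
def intent_entity_attribute_extraction(nlp, sentence, tokens_intent,
                                       tokens_products, tokens_attribute,
                                       threshold=0.7):
    doc = nlp(sentence)
    intent, product, attribute, attribute_val = '', '', '', ''
    print('matching...')
    for word in doc:
        if not word.has_vector:
            continue
        #keep the reserved word that is sufficiently similar to this input word
        for token in tokens_intent:
            if word.similarity(token) >= threshold:
                intent = token.text
        for token in tokens_products:
            if word.similarity(token) >= threshold:
                product = token.text
        for token in tokens_attribute:
            if word.similarity(token) >= threshold:
                attribute = token.text
        if word.like_num:
            #numbers in the sentence are treated as the attribute value
            attribute_val = word.text
    return intent, product, attribute, attribute_val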

Cross-checking and further requesting missing information

The program will keep asking for the intent, product, and attribute until all three pieces of information are clear to it. For the classification of each of these parameters, we deploy word vectors (Word2vec-style similarity) for simplified classification. In fact, we could run a best-in-class topic classification model, such as BERT, to understand the language and topics.

The following is the code snippet to request missing information from the user:

while intent == '':
    input_sentence = input('What do you want to do?')
    entities = intent_entity_attribute_extraction(nlp, input_sentence, \
        tokens_intent, tokens_products, tokens_attribute)
    intent = entities[0]

while product == '':
    input_sentence = input('What product do you want to check?')
    entities = intent_entity_attribute_extraction(nlp, input_sentence, \
        tokens_intent, tokens_products, tokens_attribute)
    product = entities[1]

while attribute == '':
    input_sentence = input('What attribute of the ' + product + \
        ' that you want to '+intent+'?')
    entities = intent_entity_attribute_extraction(nlp, input_sentence, \
        tokens_intent, tokens_products, tokens_attribute)
    attribute = entities[2]

Extracting the answer

When all information is filled in, the Cypher query will be executed and the information will be presented to the user. The following is the code snippet to extract the answer:

#execute the query to extract the answer
query_str = intent_dict[intent]
results = session.read_transaction(run_query, query_str, name, \
    product, attribute, attribute_val)
if len(results) > 0:
    for result in results:
        if result['TYPE'] == attribute:
            print(attribute + ' of ' + product + ' is ' + \
                str(result['VALUE']))
else:
    print('no record')

Sample script of interactions

The following snippet shows the user's input and the program's output. It is meant to show that the NLU component can indeed extract intents and entities using closely associated words, thanks to the spaCy dictionary that allows us to find similar words. The whole point of the example is to show that, for decisions requiring complete information before they are made, the graph database allows us to manage the dialogue and follow up on the missing information before any instruction is executed to serve the user. This is a very important feature when it comes to making professional decisions, where the rationale needs to be transparent to the degree of accuracy with which the machine can understand the language. The following is a snippet of a sample conversation with the chatbot:

loading nlp model
Hello, What is your name? testing
Hi testing
Failed to find testing
Hello, What is your name? abc
Hi abc
What do you like to do? do sth
matching...

What do you want to do?check sth
matching...
check
What product do you want to check?some product
matching...

What product do you want to check?deposit
matching...
deposit
What attribute of the deposit that you want to check?sth
matching...

What attribute of the deposit that you want to check?pricing
matching...
pricing
pricing of deposit is 1

Congratulations! You have built a very simple chatbot that can show you the core functionality of chatbots.

The example shown here is a good echo of where we started in commercial banking, when we used borrowers' and depositors' data with reinforcement learning. Back then, the data was stored in variables at runtime. Now, we have demonstrated another possibility: storing the data in a graph database. Indeed, compared with the example in Chapter 3, Using Features and Reinforcement Learning to Automate Bank Financing, reinforcement learning will be slower if we store data in a graph database rather than in variables within a Python program. Therefore, we use a graph database only at the production and application level, where individual dialogues can tolerate some delay, and not in the computation-intensive training phase.

Summary

In this chapter, we learned about NLP and graph databases, and we learned about the financial concepts required to analyze customer data. We also learned about an artificial intelligence technique called ensemble learning and looked at an example in which we predicted customer responses. Lastly, we built a chatbot to serve requests from customers 24/7. These concepts are very powerful. NLP enables programs to interpret the languages that humans speak naturally, while the graph database helps us store and query knowledge in a transparent way that supports professional decision making.

In the next chapter, we will learn about the practical considerations to bear in mind when you want to build a model to solve your day-to-day challenges. We will also look at the practical IT considerations involved in equipping data scientists with the language to interact with the IT developers who put the algorithms to use in real life.