Real-World Considerations – Hands-On Artificial Intelligence for Banking

Real-World Considerations

This chapter serves as the conclusion of the book. It sums up the near-term banking world we will soon be living in. We will also add some useful tips on the considerations required to incorporate these AI engines in day-to-day production environments. This part corresponds to the business understanding step of CRISP-DM, the approach for implementing any data mining project that we introduced in Chapter 1, The Importance of AI in Banking.

In this chapter, we will first summarize the techniques and knowledge that you learned throughout chapters 2 to 9, and then we will cover the forward-looking topics that will be an extension of our journey in banking. These are the topics and knowledge that will be covered:

  • Summary of techniques covered
  • Impact on banking professionals, regulators, and governments
  • How to come up with features and acquire the domain knowledge
  • IT production considerations in connection with AI deployment
  • Where to look for more use cases
  • Which areas require more practical research?

Summary of techniques covered

Following along the business segments of banking, we have covered quite a lot of data and AI techniques. We have also gone through the models with minimal use of complex formulas or jargon.

AI modeling techniques

We have covered statistical models, optimization, and machine learning models. Within machine learning models, we covered unsupervised, supervised, and reinforcement learning. In terms of the type of data the supervised learning models run on, we covered structured data, images, and languages (NLP). With regard to data processing, we have also covered a number of sampling and testing approaches that help us. We will now recap the AI modeling techniques covered in the book so far:

  • Starting with supervised learning, this is a technique in which the input data is labeled before training. The model learns from the labels so that the next set of input data can be labeled automatically. Unsupervised learning, on the other hand, does not rely on labeled data. Instead, it groups or identifies objects based on patterns and repetitions in the data.
  • Reinforcement learning is based on reaching the next immediate goal and assessing the distance from the final goal. This technique requires immediate feedback, such as a reward signal from the environment or input from the user, to reach the final goal.
  • An artificial neural network is a concept that mimics the neural network in the human brain. The neurons in the human brain are represented by nodes in the artificial neural network.
  • Deep learning is one of the areas in machine learning and artificial neural networks. Deep learning algorithms use multiple layers to extract higher-level information from the raw input data.
  • CRISP-DM is a standard for data mining. It stands for Cross-Industry Standard Process for Data Mining. It provides a structured approach to planning data mining and data analysis projects.
  • Time series analysis is a prediction technique that relies on historical data captured at a specific interval in time. In time series analysis, we decide on an observation parameter and capture the values of the parameter at a specific time interval. An example of this could be monthly expenses captured by a branch of a bank.
  • NLP is concerned with the interaction between human languages and machine languages. A speech-to-text engine capable of understanding and interpreting the human voice and performing commands is an example of NLP.
  • Finally, ensemble learning uses multiple machine learning algorithms to obtain better predictions in comparison to predictions obtained by using single machine learning algorithms.
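To make the time series bullet above concrete, here is a minimal sketch of forecasting a branch's monthly expenses with a simple moving average and checking it with a walk-forward backtest. The expense figures, the function names, and the three-month window are all assumptions chosen for illustration; they are not taken from the earlier chapters, and a moving average is only one of many possible forecasting techniques.

```python
from statistics import mean

# Hypothetical monthly expenses for one bank branch (illustrative figures only).
monthly_expenses = [120.0, 125.0, 130.0, 128.0, 135.0, 140.0]


def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return mean(series[-window:])


def backtest_mae(series, window=3):
    """Walk forward through the series and report the mean absolute error."""
    errors = [
        abs(series[t] - moving_average_forecast(series[:t], window))
        for t in range(window, len(series))
    ]
    return mean(errors)


next_month = moving_average_forecast(monthly_expenses)
mae = backtest_mae(monthly_expenses)
print(f"Forecast for next month: {next_month:.2f}")
print(f"Backtest mean absolute error: {mae:.2f}")
```

The backtest only ever uses observations that precede the month being predicted, which mirrors how the forecast would be used in production.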

Impact on banking professionals, regulators, and government

We have embarked on a long journey through commercial banking (Chapter 2, Time Series Analysis and Chapter 3, Using Features and Reinforcement Learning to Automate Bank Financing), investment banking (Chapter 4, Mechanizing Capital Market Decisions and Chapter 5, Predicting the Future of Investment Bankers), security sales and trading (Chapter 6, Automated Portfolio Management Using Treynor-Black Model and ResNet and Chapter 7, Sensing Market Sentiment for Algorithmic Marketing at Sell Side), and consumer banking (Chapter 8, Building Personal Wealth Advisers with Bank APIs and Chapter 9, Mass Customization of Client Lifetime Wealth) within the banking industry. This journey accompanies a sample corporate client, Duke Energy, from commercial banking through to investment banking. In investment banking, we begin by introducing the investment communities who are on the buying side of the securities issued by corporations, before shifting to the investment side completely in Chapter 6, Automated Portfolio Management Using Treynor-Black Model and ResNet and Chapter 7, Sensing Market Sentiment for Algorithmic Marketing at Sell Side. While we are on the topic of investment, we continue it through to the last two coding chapters (Chapter 8, Building Personal Wealth Advisers with Bank APIs and Chapter 9, Mass Customization of Client Lifetime Wealth) by zooming in on the wealth management perspective.

The final chapter helps us to focus on the data aggregation issue at the client end. Essentially, all of the bank's clients (individuals, corporations, and institutions) will own and manage their data in a centralized manner in order to cultivate their own asset: data.

Consumer markets help us to see the various types of components that are designed to push the frontier of data management. In the case of institutions and corporations, however, the data pertaining to the legal entity is significantly more complex, and a better knowledge management model is required to organize the depth of data describing the corporations.

In fact, the manner in which we organize business- and institution-related data is a topic that is no longer discussed in the era of knowledge management, even though it was once a thriving subject in business schools in the early 2000s.

Models and frameworks were proposed in the sphere of knowledge management, but the technological solutions designed to harvest this knowledge with minimal effort are lacking on the part of many organizations. Working in a consulting firm back then, I witnessed the pain of maintaining the body of knowledge, which is the core asset of the consulting industry—business know-how. Now, we will return to this topic as we are going to make our machine smart and make financial decisions at an expert level. The boundaries of robots will be expanded when we can explicitly maintain them in the name of data quality management.

Implications for banking professionals

Never-ending debates are taking place regarding the changes in business models currently used by banks. My personal view is that the core decisions regarding forward-looking risk and return forecasts remain. Ultimately, we need a bank to fund future activity that is not completely certain at the time a financial decision is made. We are also making a decision regarding our future wealth, which we also do not have full control of. What has changed is the speed of update of this risk and return view as a result of machines and the rapid increase in the number of these expert decisions being left to laypeople.

Looking at the banking industry today, it is not just banking professionals, but also governments, regulators, and consumers who will push the boundaries together to create more efficient financial markets, with a free flow of data with clear ownership and technologies to attribute the values provided by the data in all walks of decision making.

Across the industry, Open API remains a pain point for incumbents, while it is a topic where new entrants are crying foul, according to Michael E. Porter's Five Forces, in terms of the competitive positioning of companies.

Alongside the open bank API, which is just a gateway to get data, no worthwhile discussion is taking place on banking data standards for customers and clients. Indeed, APIs appear fashionable, but they also pose a huge problem in terms of dealing with how data is stored by customers for their own benefit. It seems less than ideal for individual customers to consider this topic, since the usability of the data will be polarized between those who can organize and those who cannot.

Implications for regulators

Gone are the days when instructions to banks were best-effort; models and validation are now in place to guarantee the quality of services and to protect investors' interests. Investors need to know the features that can create volatility in financial valuations. Perhaps timeliness becomes key, instead of expecting banks to have definite views about what might happen? The probability regarding risky events set by banks can be validated.

Implications for government

How do we provide a technology to allow individuals to own their data? How is the government taking the lead when it comes to storing an individual's identity as well as all their footprints? Personal data standards would help to reduce the cost of sharing, storing, and having individuals manage their own data.

GDPR in Europe is a good regulation, but it essentially lacks the technology required to execute it, as in the case of knowledge management. Likewise, data that describes interactions between companies, corporations, institutions, and public markets will be considered utilities provided by the government, as stock exchanges own the data.

Following the philosophy of public benefit, the objective is not to make huge profit but to provide public services that facilitate other economic activities. I believe more government intervention is warranted with regard to how public domain data is distributed. This will give robots an easier environment in which to work as the data availability issue underpins the entire book. It is not hard to see how it creates a bigger drag on AI adoption in all walks of economic activity. Counterbalancing this openness in relation to data, again, we require better control of the data, for companies and individuals alike.

The open data movement has been a buzzword in terms of allowing data to be available. Open government data may touch on issues such as stock exchange data, which is sometimes run by quasi-government organizations under specific licenses or regulations. Likewise, open bank data is also driven by global financial regulators as a driving force to provide bank customers with their own data.

At a practical level, data is a key ingredient for AI, and in some cases, it is cleansed, maintained, and provided as a benefit that costs taxpayers money! However, the resources spent on data maintenance also depend on the AI use cases that generate cost savings in terms of automated and better decision making. By this simple logic, someone has to pay: either through a shared pool in the government budget (that is, taxpayers, including you) or those who use the data (you again!). One of the challenges in making data accessible is tracking its usage. Once usage is tracked, it becomes considerably easier to ask anyone who has used the data to pay for the particular data point, down to the field level.
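As a rough illustration of charging for data usage down to the field level, the sketch below counts every field read per consumer and turns the counts into a bill. The `UsageMeter` class, the consumer name, the field names, and the prices are all hypothetical; a real metering service would also need authentication, persistence, and auditing.

```python
from collections import Counter

# Hypothetical price list: cost per read of each data field (illustrative).
FIELD_PRICES = {"closing_price": 0.002, "volume": 0.001}


class UsageMeter:
    """Counts field-level reads per consumer so usage can be billed later."""

    def __init__(self, prices):
        self.prices = prices
        self.reads = Counter()  # (consumer, field) -> number of reads

    def read(self, consumer, record, field):
        """Return the requested field and log who read it."""
        value = record[field]  # fail before logging if the field is missing
        self.reads[(consumer, field)] += 1
        return value

    def bill(self, consumer):
        """Sum the charges for everything this consumer has read."""
        return sum(
            count * self.prices[field]
            for (who, field), count in self.reads.items()
            if who == consumer
        )


meter = UsageMeter(FIELD_PRICES)
record = {"closing_price": 91.5, "volume": 1_200_000}
meter.read("robo_adviser", record, "closing_price")
meter.read("robo_adviser", record, "closing_price")
meter.read("robo_adviser", record, "volume")
print(f"Amount due: {meter.bill('robo_adviser'):.3f}")
```

Routing every read through one function is the design choice that makes field-level billing possible; data that is copied out and reused elsewhere escapes the meter.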

AI can be the electricity of tomorrow, but data will first be provided as the electricity of today.

How to come up with features and acquire the domain knowledge

In all the chapters so far, we have not explained where we get this domain knowledge from. A typical AI project requires us to slip into the shoes of finance professionals. Where to begin? The following is a list that will help you:

  • Textbook and training courses: The easiest path to follow is to follow how these professionals are trained. These courses contain the jargon, methodologies, and processes designed for the respective work type.
  • Research papers in banking and finance: When it comes to finding the right data, research in finance and banking can prove to be a very valuable resource. It will not only show where to get the data; it will also showcase those features with strong powers of prediction. However, I normally do not get lost in the inconsistency of features across authors and markets. I simply include them all as far as possible—with the support of theory by researchers.
  • Observing the dashboards: BI captures the key features that are meaningful to human users. The data fields used in those BI reports are good features that can vividly describe the problems as stipulated by human experts.
  • Procedural manuals: In case you are working in an established organization with processes and procedures in place, these manuals serve as a valuable source to describe how humans work, especially those doing processing-intensive work.
  • Institutions: Some say design thinking, some say it is just putting yourself into the shoes of others, by trying to work on the task in a hypothetical manner.
  • Observing child development: In case it is related to tasks such as perceiving and communicating information, we can observe how humans learn to build up the components and understand what a neural architecture should look like.
  • Looking for Excel: Excel has become a dominant tool in business schools, and is a semi-standardized form of decision making, especially in the area of financial modeling. This serves as a good starting point to understand how humans make decisions and the complex rules associated with doing so.

The preceding points cover business domain considerations, but we also need to consider the IT aspect of rolling out the model.

IT production considerations in connection with AI deployment

AI is just a file if the algorithm is not run in the day-to-day decision making of banks. The trend, of course, is to provide AI as a service to the software developers who write the program. This aside, there is a list of items that require attention:

  • Encryption: Data is key and all the AI runs on sensitive data. Even though the data is anonymized somewhat by the scalers that change the data into the range of zero to one, encryption remains important: it must be in place when the data is in transit via the network and when it is at rest in an encrypted database.
  • Load balancing: We need to handle requests with the correct capacity and provision sufficient servers to run the algorithm. With the trend of going serverless with a cloud provider, the issue appears to have abated somewhat. However, the issue still remains; it is just being outsourced. For an engineer, having an appreciation of capacity and how to handle load is about the level of service. We want a smart robot that is always available, instead of having a smart engine that disappears when people are in desperate need. To do so requires an appreciation of usage traffic, hardware and software capacity planning, as well as a means to execute it alongside the traffic change.
  • Authentication: Organizations normally have their own preferences regarding authentication. It can have quite an impact on customer experience, while security remains a concern.
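The encryption point above notes that scaling data into the range of zero to one only anonymizes it somewhat. The following minimal sketch shows why: min-max scaling preserves the ordering of the values and can be inverted by anyone who holds the minimum and maximum. The balances and function names are assumptions made up for illustration.

```python
def min_max_scale(values):
    """Scale values into [0, 1]; also return the parameters used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def invert_scale(scaled, params):
    """Recover the original values from scaled values and (min, max)."""
    lo, hi = params
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical account balances (illustrative figures only).
balances = [1_500.0, 82_000.0, 430.0, 12_750.0]
scaled, params = min_max_scale(balances)
recovered = invert_scale(scaled, params)

print(scaled)
print(recovered)
```

Because the scaler parameters usually travel with the model, the scaled data should be treated as just as sensitive as the raw data, which is why encryption in transit and at rest is still needed.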

Where to look for more use cases

AI applications listed in this book largely focus on front-office banking services; the back-office processing jobs are not covered in any great detail. Stepping back, where should you look out for opportunities in case you wish to start your own project?

  • Long hours; boring job: Boring means repetitive, and that's where machines thrive and data is rich.
  • Large labor force: When it comes to business cases, it is easy to look for jobs that have high levels of employment. This means a huge business potential and easy-to-justify implementation. This constitutes a huge challenge for HR professionals.
  • High pay: If we were to make finance accessible, can we make these highly paid jobs even more productive? In the case of investment bankers, security structurers, and hedge fund traders, how can their non-productive time be reduced?
  • Unique dataset: If the dataset is not accessible to outsiders, the chance of the domain not being looked at is high since researchers and start-ups cannot detect this problem.

Which areas require more practical research?

In certain areas, this book has hit the ceiling of research, and these are the research areas that could help move AI applications in banking:

  • Autonomous learning: AI will be replacing the work of AI engineers, given that the machine will be able to learn. Given the wealth of data nowadays, the machine will adapt its network structure itself.
  • Transparent AI: As the machine starts to make decisions, humans will demand transparency as regards the decision-making process.
  • Body of knowledge: In the case of expert knowledge, further research will look at how organizations can use AI to generate the body of knowledge. Practically, the Wikipedia form stored in BERT or any language model is not intended for human consumption or knowledge cultivation. And how do we squeeze the knowledge map to form a neural network, and vice versa?
  • Data masking: To allow data to travel and exchange freely, a flexible data-masking mechanism that preserves distribution characteristics within a field and in between data fields is important. It allows data to be shared with researchers or even open sourced for attack by smart data scientists. A secondary question in connection with a good masking mechanism is whether data owners can share their data with research communities in order to work on real-world challenges. Is this regarded as a donation, and is it therefore tax deductible?
  • Data aggregation and standardization: As covered earlier in this chapter, this describes how client data is standardized and how individuals and companies are allowed to own and manage their data.
  • Cross-disciplinary, task-focused applications: To uncover more research topics, it is very important for researchers from different disciplines to work together in solving a task-focused problem, rather than working on a dataset that is meant to tackle a single research topic.
  • Data life cycle technologies: Since data is used in so many places, and is altered, updated, and copied across systems, do we have the right technology to keep track of all of these movements? Once we can track its movement, we can then attribute the values to the contributors of data in the supply chain to incentivize data production. Some advocate blockchain, but saving huge amounts of data on blockchain does not seem practical.
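As one possible reading of the data life cycle bullet above, the sketch below keeps an append-only log of data movements in which each entry stores only a digest of the data plus the hash of the previous entry. This keeps the bulky data itself off the ledger while still making tampering detectable. The `LineageLog` class, the actor names, and the sample record are hypothetical, and counting actions per actor is only a naive stand-in for real value attribution.

```python
import hashlib
import json


def _digest(payload):
    """Stable SHA-256 digest of a JSON-serializable payload."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class LineageLog:
    """Append-only log where each entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    def record(self, actor, action, data):
        """Log a data movement, storing only a digest of the data itself."""
        body = {
            "actor": actor,
            "action": action,
            "data_digest": _digest(data),
            "prev": self.entries[-1]["hash"] if self.entries else None,
        }
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Check that no entry was altered or removed from the middle."""
        prev = None
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True

    def contributions(self):
        """Count logged actions per actor, a naive basis for attribution."""
        counts = {}
        for entry in self.entries:
            counts[entry["actor"]] = counts.get(entry["actor"], 0) + 1
        return counts


# Illustrative records only; the prices are made up.
log = LineageLog()
log.record("exchange", "publish", {"ticker": "DUK", "price": 91.5})
log.record("bank", "copy", {"ticker": "DUK", "price": 91.5})
log.record("bank", "update", {"ticker": "DUK", "price": 92.0})
print(log.verify(), log.contributions())
```

A single in-memory log like this is not a blockchain; it has no distribution or consensus, and is meant only to show how movements can be tracked without storing the data on the chain itself.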


This book is designed to illustrate the current possibilities in terms of technology using public domain information. I hope that it helps to create a supply of talent and researchers to aid the industry. It also creates a base level of performance that any potential start-up needs to beat in order to be qualified. With all code in books now being open source, I am happy to be at the bottom of performance with regard to technological solutions!

There are too many books that are purely visionary, and there are also books that talk about the technical considerations without getting to the heart of the problem. There are books full of mathematics and formulas that deal with encrypted knowledge. This book is here to bridge the gap between the vision and the technical considerations. I believe that future generations deserve to study in an AI utopia before they come to change our world of work. For professionals in the industry and those wishing to upgrade their skills, I hope that this book can suggest a number of areas of interest to consolidate further.

This is just the beginning.