Syoncloud Big Data for Retail Banking
Syoncloud offers a comprehensive Big Data / Data Science solution for retail banks.
It covers areas such as:
- Individualization of product offers to existing clients
- Early fraud detection and fraud damage mitigation
- Prediction of product cancellations and client defections
- Optimal allocation of cash to ATMs and bank branches
- Minimization of usage of expensive bank channels such as branch visits
- Reliable assessment of clients for debt products
The key advantage of Syoncloud is that it collects full datasets from backups and relational databases into a Hadoop environment and applies ETL to standardize, cross-link and validate the data. Syoncloud applies machine learning technologies to find hidden patterns and correlations in the data. Built-in machine learning technologies use these patterns to predict customer behaviour, fraud, resource allocation needs and business opportunities.
Common Datasets are used as a foundation for complex analysis.
Common Datasets for Analysis Related to Bank's Clients
Syoncloud creates a dataset of monthly expense and income categories for all clients, covering all their accounts and their complete history. This dataset is created from bank account movements, direct debits and standing orders. Each account movement is usually accompanied by a movement type code, such as electricity, phone bill or restaurant. Syoncloud also uses the merchant's name, description and comment fields to categorize each transaction. Direct debits and standing orders have similar type codes.
Syoncloud recognizes several categories of expenses such as housing expenses (rent or mortgage), energy expenses (gas and electricity), food and household related expenses, education (schools, books, courses), car expenses (fuel and repairs), restaurants, big ticket items (TV, furniture), taxes, recreation and hobby, credit card and loan payments, luxury items and so on.
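The categorization and monthly aggregation described above can be sketched roughly as follows. This is only an illustration: the type codes, keywords and field names (`type_code`, `merchant`, `description`, `comment`, `client_id`, `date`, `amount`) are hypothetical, and a real bank would use its own code tables.

```python
from collections import defaultdict

# Hypothetical type codes and keywords, for illustration only.
TYPE_CODE_TO_CATEGORY = {
    "ELEC": "energy",
    "GAS": "energy",
    "RENT": "housing",
    "REST": "restaurants",
    "FUEL": "car",
}
KEYWORD_TO_CATEGORY = {
    "mortgage": "housing",
    "school": "education",
    "supermarket": "food and household",
}

def categorize(transaction):
    """Return an expense category for one account movement."""
    # Prefer the movement type code when it is known.
    code = transaction.get("type_code")
    if code in TYPE_CODE_TO_CATEGORY:
        return TYPE_CODE_TO_CATEGORY[code]
    # Otherwise fall back to keywords in the free-text fields.
    text = " ".join(
        transaction.get(field) or ""
        for field in ("merchant", "description", "comment")
    ).lower()
    for keyword, category in KEYWORD_TO_CATEGORY.items():
        if keyword in text:
            return category
    return "uncategorized"

def monthly_totals(transactions):
    """Sum amounts per (client_id, 'YYYY-MM', category)."""
    totals = defaultdict(float)
    for t in transactions:
        month = t["date"][:7]  # assumes ISO dates such as "2014-03-05"
        totals[(t["client_id"], month, categorize(t))] += t["amount"]
    return dict(totals)
```

Simple substring matching is deliberately naive here; it only shows how type codes and text fields can be combined.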
Income categories are salaries, dividends, tax refunds, social benefits, rental income, sales and so on. Simple regression analysis of this dataset gives us overall trends for total expenses, income and savings, as well as detailed trends for each income and expense category for each client.
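As an illustration of the simple regression mentioned above, the sketch below fits a least-squares line to one client's monthly totals in one category. The function name `linear_trend` is hypothetical, and equal monthly spacing is assumed.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against month index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two monthly values")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A positive slope for an expense category means that spending in that category grows month over month; the same function can be applied to total income or savings.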
Machine Learning and Predictions
Syoncloud uses a full range of machine learning algorithms and models to make predictions. They fall into two broad categories: supervised and unsupervised algorithms.
Supervised learning algorithms use historical data to learn that certain combinations of input values lead to certain output values. Syoncloud creates models that are trained and verified on samples of historical data. Sample data can be chosen randomly, but we have seen better results when datasets are categorized first. For the customer dataset, Syoncloud creates categories such as age, income, location based on town size, education and savings. Each category is split into brackets. For example, the age category is split into 20 five-year age brackets. Because we know the number of customers in each age bracket, we can sample 5% of the records from each bracket. Syoncloud does the same for the other categories. These samples are ideal for seeing which categories make the largest contribution to the overall results. For example, we may see that education makes the largest contribution to the acceptance of a certain investment product.
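The bracket-based sampling described above is a form of stratified sampling. A minimal sketch, assuming customer records are dictionaries with a hypothetical `age` field, could look like this:

```python
import random
from collections import defaultdict

def age_bracket(age, width=5):
    """Index of the five-year bracket: 0 for ages 0-4, 1 for ages 5-9, and so on."""
    return age // width

def stratified_sample(customers, fraction=0.05, seed=0):
    """Sample the same fraction of records from every age bracket."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_bracket = defaultdict(list)
    for customer in customers:
        by_bracket[age_bracket(customer["age"])].append(customer)
    sample = []
    for bracket in sorted(by_bracket):
        members = by_bracket[bracket]
        # Keep at least one record so small brackets are still represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample
```

The same approach applies to the other categories (income, town size, education, savings) by swapping the bracket function.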
Unsupervised machine learning algorithms look for unknown patterns in available data.
For example, Syoncloud looks for patterns of unusual client behaviour to find early signs of fraud. In the past, fraud detection was limited to statistical analysis of behaviour that was common to all clients or to large groups of clients. Syoncloud uses unsupervised learning models to find patterns that surface even in small numbers of records.
Individualization of Product Offers
Syoncloud enables individualization of product offers to existing clients. Banks save money on expensive broad marketing campaigns for bank products. Products are offered only to customers who need them and are likely to accept them, so customers see fewer irrelevant offers. This requires deep knowledge of who accepted a given product in the past.
Syoncloud uses datasets of each client's subscriptions to bank products and services, including historical values. It also uses the common dataset of income and expense categories for each client and CRM data about clients. Syoncloud creates a separate model for each product and subscription. It chooses and verifies the best learning algorithm and finds which categories and variables have the biggest influence.
Early Fraud Detection and Fraud Damage Mitigation
This feature includes detection of identity fraud, credit card fraud, wire fraud, attacks on internet and mobile banking, and money laundering. New types of fraud and new schemes require flexible and fast detection algorithms. In the past, banks used only statistical and rule-based algorithms to find whether suspicious activity was taking place. These algorithms were limited: they could recognize only known frauds, they required expensive maintenance, they did not work with the full history of each client and they produced a high level of false positives.
Syoncloud utilizes a dataset of known fraud cases. It sorts fraud cases into several categories such as overdraft fraud with stolen identity, stolen credit cards, consumer loan fraud, credit card top-up with fraudulent checks, stolen checks, skimming with card duplication, attacks on online banking with stolen customer credentials and/or security devices, rogue online merchant fraud using credit cards and so on. Syoncloud uses neural networks with backpropagation, decision tree algorithms and classification to find patterns and previously undetected occurrences of these frauds in existing data.
Prediction of Product Cancellations and Client Defections
Prediction of bank product cancellations and client defections is very time sensitive. The bank has just days to act before a client irreversibly decides to cancel a product or move to a competitor. The bank needs to identify clients who are likely to defect, contact them and proactively offer alternative products or solve their issues. It is much cheaper to retain highly profitable clients than to attract them back.
Syoncloud uses account movements, debit and credit card movements, the client dataset from CRM, the product subscription dataset, call centre and branch visit transactions, and log information as primary data sources for predictions. It also utilizes the common datasets of income and expenses.
Syoncloud creates time series of key events such as direct debit cancellations, income to the account from salaries, dividends and rents, transfers to the client's accounts at other banks, call centre and branch contacts made by the client (separated into categories), credit card cancellations and so on.
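One simple way to build such a time series is to count events of a given type per month for each client. The sketch below is an assumption about the data layout, not a description of the actual pipeline: the field names `client_id`, `type` and `date` and the event type labels are hypothetical.

```python
from collections import Counter

def event_series(events, client_id, event_type, months):
    """Monthly counts of one event type for one client, in the order of `months`."""
    counts = Counter(
        e["date"][:7]  # assumes ISO dates such as "2014-02-11"
        for e in events
        if e["client_id"] == client_id and e["type"] == event_type
    )
    # Months without any matching event get an explicit zero.
    return [counts.get(month, 0) for month in months]
```

For example, a rising count of `direct_debit_cancelled` events over consecutive months would show up as an increasing series for that client.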
As a comparison group, Syoncloud selects another set of clients who match the defecting clients in categories such as age, income, savings and location for the same time interval but who still remain clients.
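The selection of matching clients who stayed with the bank can be sketched as exact matching on coarse brackets. The bracket widths and the field names `age`, `income`, `savings` and `town_size` are illustrative assumptions.

```python
from collections import defaultdict

def profile(client):
    """Coarse matching key; the bracket widths are illustrative assumptions."""
    return (
        client["age"] // 5,           # five-year age bracket
        client["income"] // 10000,    # income bracket
        client["savings"] // 10000,   # savings bracket
        client["town_size"],          # location category
    )

def match_controls(defected, retained):
    """Pair each defected client with one unused retained client of the same profile."""
    pool = defaultdict(list)
    for client in retained:
        pool[profile(client)].append(client)
    pairs = []
    for client in defected:
        candidates = pool.get(profile(client))
        if candidates:
            # pop() ensures a retained client is used as a control only once.
            pairs.append((client, candidates.pop()))
    return pairs
```

Defected clients without any matching retained client are simply left unpaired in this sketch.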
Based on these input datasets, it creates models that are able to predict the behaviour of clients before they irreversibly decide to move to competitors. It uses several supervised learning algorithms, such as Support Vector Machines for binary classification and neural networks with backpropagation for predictions. Among unsupervised machine learning algorithms, it utilizes K-Means and Mean Shift clustering after Principal Component Analysis has been applied to reduce the dimensionality of the input data.
Syoncloud identified several hundred profitable clients in recent data who match the patterns of clients who moved their accounts to competitors. These clients should be contacted by their respective bank branches.
Optimal Allocation of Cash for ATMs and Bank Branches
Demand for cash is highly variable during the year at many ATMs and bank branch locations. The variability is caused by weather, local events, vacations, tourism and so on. It is important to predict the right amount of cash that needs to be deposited into ATMs and bank branches. It is costly to service ATMs too often, and it is also costly to have cash machines out of order due to lack of cash. At the same time, we want to limit the amount of unnecessary cash that is stored for long periods in ATMs and bank branches, because it leads to suboptimal cash allocation and attracts crime.
As primary datasets, Syoncloud uses ATM service logs, geographic locations of ATMs and bank branches, a withdrawals dataset for each ATM, weather reports for ATM and bank branch locations, and schedules of sports, cultural and other events as well as holidays for all locations. Syoncloud also utilizes credit and debit card movements to assess demand for cash at various locations and at different times of the year. It uses the common datasets of income to see when salaries, social benefits and other income arrived in clients' accounts at different locations.
Syoncloud creates a dataset of median cash withdrawal amounts for each day of the year and hour of the day for all ATMs. This dataset is used to calculate the influence of weather, events, day of the week and holidays on demand for cash at a given location.
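One possible reading of this dataset is sketched below: cash withdrawn is first summed per ATM, year, day of year and hour, and the median is then taken across years. This interpretation, the function name and the field names `atm_id`, `timestamp` and `amount` are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def median_hourly_demand(withdrawals):
    """Median over years of cash withdrawn per (atm_id, day of year, hour)."""
    # Step 1: total cash withdrawn in each hour slot of each year.
    totals = defaultdict(float)
    for w in withdrawals:
        ts = datetime.fromisoformat(w["timestamp"])
        day_of_year = ts.timetuple().tm_yday
        totals[(w["atm_id"], ts.year, day_of_year, ts.hour)] += w["amount"]
    # Step 2: collect the yearly totals of each slot and take their median.
    by_slot = defaultdict(list)
    for (atm_id, _year, day_of_year, hour), total in totals.items():
        by_slot[(atm_id, day_of_year, hour)].append(total)
    return {slot: median(values) for slot, values in by_slot.items()}
```

Note that the day-of-year index shifts by one after 29 February in leap years, so a production version would probably align slots by calendar date or by weekday instead.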
Syoncloud utilizes a dataset of significant cultural, sports and other events during the past 4 years, with location coordinates. It calculates the influence of each event on cash demand for all ATMs within a 100 m radius of the event. It is able to rank all events by their influence on cash demand. This dataset is used to predict the influence of similar events.
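Finding the ATMs within a 100 m radius of an event requires a distance between two coordinate pairs. A common choice is the haversine great-circle distance, sketched below; whether Syoncloud uses this exact formula is not stated, and the field names `id`, `lat` and `lon` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def atms_near_event(event, atms, radius_m=100):
    """Ids of ATMs whose distance from the event is at most radius_m metres."""
    return [
        atm["id"]
        for atm in atms
        if haversine_m(event["lat"], event["lon"], atm["lat"], atm["lon"]) <= radius_m
    ]
```

The influence of an event can then be estimated by comparing the cash demand of the returned ATMs on the event day with their usual demand for the same day and hour.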
Syoncloud also calculates the correlation between cash demand and local weather parameters such as precipitation, temperature and wind at the location of each ATM.
Syoncloud creates a correlation dataset between the days when clients receive income, such as salaries and social benefits, and cash demand at different locations.
It creates models that can predict cash demand for each day of the year for each ATM and bank branch location. These models take into account historical weather forecast data and schedules of events. Syoncloud utilizes algorithms such as Restricted Boltzmann Machines, the Perceptron and Gaussian Discriminant Analysis.
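The correlations mentioned above, between weather or income days and cash demand, can be illustrated with the Pearson correlation coefficient. The text does not say which correlation measure is used, so this is only one plausible choice, and the function name is hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length of at least two")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)
```

For one ATM, `xs` could be daily precipitation and `ys` the daily cash withdrawn; a value close to -1 would suggest that rain strongly reduces demand at that location.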
Minimization of Usage of Expensive Channels
Syoncloud can help minimize usage of expensive bank channels such as over-the-counter operations and other visits to bank branches, as well as calls to call centres.
This can be achieved by optimizing online banking and mobile banking applications, help pages and wizards, as well as the web pages on the bank's websites. Another way to encourage reluctant clients to switch to cheaper channels is through targeted campaigns.
The primary sources of data for analysis are web log files from the online banking application and from mobile banking applications. Syoncloud also uses bank account movements with bank channel codes, a dataset of call centre transactions, a CRM dataset with information about customers and a dataset of transactions from bank branches.
Another important dataset consists of complaints and enquiries from the call centre, emails, letters and branches. Syoncloud sorts this dataset by area of interest and correlates it with help web pages. It is able to identify help pages that are unclear and cause confusion and unnecessary calls to the call centre. It also identifies operations in online banking that are complex and generate a higher number of complaints. It uncovered several areas related to exchange rates during credit card payments that were not covered by help pages but were often discussed over the phone or even during bank branch visits. Changes made to product-related web pages, self-help content, search optimization, online banking operations and mobile banking applications can bring quick savings on outsourced call centres and bank branch visits.
Syoncloud analyses the results of marketing campaigns designed to move reluctant clients to online and mobile banking or self-service kiosks. It used correlation analysis and uncovered that some broad marketing campaigns were not efficient. Syoncloud also analyses the patterns of bank clients who recently moved most of their operations online. This gave us a tool to select the portion of clients who are more likely to move online. These customers should be targeted by personalized marketing campaigns or by demonstrations of the advantages at bank branches.
Assessment of Clients for Debt Products
In order to reliably assess risks and approve debt products for existing clients, we need to take into account not just the current credit scores and current disposable income of the clients but also their complete history and social context. This decreases risk for the bank and increases income from valuable clients who would otherwise be rejected.
As primary sources of data, Syoncloud uses the common dataset of income and expenses, the complete payment history for credit cards, consumer loans, mortgages, overdrafts and other debt products, and CRM information about clients.
It uses a Markov chain stochastic process to model clients' debt and payment behaviour. This model is tested on historical data of profitable and defaulted loans, credit cards and other debt products. We have observed improved reliability of credit scores, and we were able to suggest suitable alternative debt products for rejected clients.
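To illustrate the Markov chain idea, the sketch below estimates transition probabilities between monthly payment states from historical sequences. The state names (`on_time`, `late`, `default`) and the first-order, monthly formulation are assumptions for illustration; the actual model may be defined differently.

```python
from collections import Counter, defaultdict

def transition_probabilities(histories):
    """Estimate P(next state | current state) from observed state sequences."""
    counts = defaultdict(Counter)
    for states in histories:
        # Count every consecutive pair (current month, next month).
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    return {
        current: {
            nxt: count / sum(followers.values())
            for nxt, count in followers.items()
        }
        for current, followers in counts.items()
    }
```

With such a matrix, the probability that a client who is currently `late` reaches `default` within a few months can be computed by multiplying transition probabilities along the possible paths.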