What is machine learning, and why does it matter?

What is machine learning?

Machine learning (ML) is about extracting knowledge from data. It is a subfield of artificial intelligence that enables machines to learn from past data or experience without being explicitly programmed. It supports automatic predictions and decision-making by analysing large volumes of structured or semi-structured historical data.

Machine learning use cases. Where is ML used?

Machine learning can be used wherever a large amount of data can be taken advantage of to add intelligent application features or enhance existing processes.

Some examples of leveraging machine learning technology include:

Increasing engagement on social media platforms. Facebook, Twitter and Instagram have embedded ML mechanisms that suggest further posts or fan pages users could find interesting, or groups they might want to join. The algorithms analyse data on the current user’s interests and on other users’ actions to prepare suggestions;
Product recommendation engines on e-commerce platforms. ML models can combine data on product availability, product range changes, promotions and special offers with historical records of user actions and information on what the user is doing at the moment;
Image recognition and insights derivation from unstructured data. These models can be used to classify images, detect objects, read handwriting, or even identify emotions from photos or videos containing human faces;
Text translation. The technology enables the translation of whole websites, pieces of content in online applications, or even text from the real world, like storefronts, menus, documents, or business cards, using a smartphone’s camera;
Biotechnology, medicine and diagnostics. Machine learning can support treatment by helping to reconstruct the underlying mechanisms of disease, enabling patient self-triage based on reported symptoms, or predicting the structure of cells or biopolymers;
Detection of phishing, spam or suspicious behaviour. Analysis of past actions (such as mail usage or banking system transactions) enables ML mechanisms to track anomalies and warn users of a potential attack.

Machine learning models and algorithms

Deriving insights requires creating a model: a computer program with specific rules and data structures, trained to analyse data in order to find patterns or make decisions.

There are three main types of machine learning models:

supervised learning,
unsupervised learning,
and reinforcement learning.

Supervised learning

Supervised algorithms are trained on known or labelled datasets, that is, datasets in which each example comes with the correct answer.

One example of the use of supervised learning is the prediction of prices in the property market. The model is trained using historical data on flat size, the number of rooms, amenities, garden, neighbourhood and other information. With access to thousands of such records, the model can estimate the price of a property with specific parameters.

Regression analysis

Regression analysis encompasses various statistical methods for estimating the relationship between input variables and a continuous output value. Its most common form is linear regression, where a single straight line is fitted to the data to capture the trend.

A regression algorithm can be used to estimate the property prices mentioned previously. The fitted model captures how the price changes, on average, as inputs such as size or number of rooms change, so it can also be used to make forecasts for properties it has not seen.
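As a minimal sketch of linear regression, the snippet below fits a straight line to a handful of flat sizes and prices using ordinary least squares. The figures are made up (and deliberately lie on a perfect line), and the names fit_line and predict_price are illustrative assumptions rather than part of any library.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: flat size in square metres, price in thousands.
sizes = [30, 45, 60, 75, 90]
prices = [150, 210, 270, 330, 390]

slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Estimate the price of a flat of the given size from the fitted line."""
    return slope * size + intercept


print(predict_price(50))
```

Real property data is noisy and depends on many inputs, so a practical model would use several variables and be checked against records it was not trained on.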

Classification

A classification algorithm assigns objects to predefined categories by learning patterns from labelled examples. Suppose you want to train a model to recognise whether a photo shows a dog or a cat. Having a vast collection of pictures of both species, you can label them (dog/cat) to train the model. During training, the algorithm compares its prediction for each photo with the species label and adjusts itself to reduce the mistakes. Eventually, it learns to recognise the species in new photos from outside the training dataset.
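The idea of learning from labelled examples can be sketched without any image processing. The example below swaps photos for two hypothetical numeric features (weight and ear length) and uses a k-nearest-neighbours classifier, one simple classification technique among many; the data and the classify function are assumptions made for illustration.

```python
from collections import Counter
import math

# Hypothetical labelled examples: (weight in kg, ear length in cm) -> species.
training_data = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((12.0, 10.0), "dog"),
]


def classify(features, k=3):
    """Label a new example by majority vote among its k nearest neighbours."""
    by_distance = sorted(
        training_data,
        key=lambda item: math.dist(item[0], features),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(classify((4.2, 6.8)))    # an animal close to the labelled cats
print(classify((25.0, 12.0)))  # an animal close to the labelled dogs
```

Real photo classifiers work on pixel data and typically rely on neural networks, but the principle is the same: new inputs are labelled according to the labelled examples they most resemble.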

Decision trees

A frequently used algorithm in supervised ML is the decision tree. It classifies data through a sequence of simple tests on categorical and continuous variables. For example, decision trees can be used in banking to automate initial credit decisions. With customer data available, values such as income, age, employment, criminal record or previous financial commitments can be included in the model to estimate whether a loan should be granted and for what amount.
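A decision tree is built by repeatedly choosing the test that best separates the classes. The sketch below shows a single such step on one continuous variable: it searches for the income threshold that minimises Gini impurity. The applicant data and function names are hypothetical; a full tree would repeat this search recursively on each side of the split and across every variable.

```python
def gini(labels):
    """Gini impurity of a list of boolean labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Find the split 'value <= threshold' with the lowest weighted impurity."""
    best = (None, float("inf"))
    candidates = sorted(set(values))
    # Try the midpoint between each pair of neighbouring values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical applicants: monthly income and whether the loan was repaid.
incomes = [1800, 2200, 2500, 3100, 3600, 4200]
repaid = [False, False, False, True, True, True]

threshold, impurity = best_threshold(incomes, repaid)
print(threshold, impurity)
```

In this toy data a single threshold separates the two groups perfectly; real credit data would need several levels of splits and would never be this clean.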

Unsupervised learning

Unsupervised ML algorithms learn from unlabelled data sets. They are mostly used in cases where the data is constantly growing and it is impractical to label it manually in order to train the model.

An example is the detection of spam in inboxes: the model looks for patterns and ‘autonomously’ learns what junk emails look like.

Clustering

A clustering algorithm is similar to a classification algorithm, but the groups are not defined in advance: the model works on unlabelled data that would be hard for a human to group. It is used to find structure, underlying processes, or features in a set of examples. Clustering is used, among other places, in the healthcare sector to analyse medical images when searching for diseases.
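One widely used clustering method is k-means, which alternates between assigning points to the nearest group centre and moving each centre to the mean of its group. The sketch below is a minimal version for 2-D points; the data, the starting centres and the function name k_means are assumptions chosen for illustration.

```python
import math


def k_means(points, centroids, iterations=10):
    """Group 2-D points around the given starting centroids (Lloyd's algorithm)."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabelled measurements forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
```

No labels are supplied anywhere: the two groups emerge purely from the distances between points. The number of clusters still has to be chosen up front, which is one of the practical difficulties of this method.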

Dimensionality reduction

Dimensionality reduction is a family of ML techniques that remove redundant data, leaving only what is to be analysed or used to train the model. This could be the elimination of unnecessary columns from a spreadsheet or, for example, of pixels from an image (which is used in facial recognition to increase processing speed).
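The spreadsheet case can be sketched in a few lines. The example below drops columns whose values never change, since they carry no information a model could learn from. This is the simplest form of feature selection; the table and the function name drop_constant_columns are hypothetical.

```python
from statistics import pvariance


def drop_constant_columns(rows, min_variance=0.0):
    """Keep only the columns whose variance exceeds min_variance."""
    columns = list(zip(*rows))
    kept = [i for i, column in enumerate(columns) if pvariance(column) > min_variance]
    reduced = [[row[i] for i in kept] for row in rows]
    return reduced, kept


# Hypothetical table: columns are size, a constant flag, and number of rooms.
rows = [
    [30, 1, 1],
    [45, 1, 2],
    [60, 1, 3],
    [75, 1, 3],
]

reduced, kept = drop_constant_columns(rows)
print(kept)
print(reduced)
```

Other techniques, such as principal component analysis, go further and combine the original columns into a smaller number of new ones instead of simply discarding some.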

Anomaly detection

Anomaly detection is used to find unusual objects or behaviours. Finding these in a vast mass of data is impractical for a human but relatively simple for a machine. An example would be informing an application user of unusual behaviour on their account to help track down a potential hack.
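A simple statistical way to flag unusual values is the modified z-score, which measures how far each value lies from the median relative to the typical spread. The sketch below applies it to made-up transaction amounts; the constant 0.6745 and the threshold 3.5 are commonly used defaults, and the function name find_anomalies is an assumption.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    centre = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        # More than half the values are identical; flag anything that differs.
        return [v for v in values if v != centre]
    return [v for v in values if 0.6745 * abs(v - centre) / mad > threshold]


# Hypothetical card transactions in one account, in local currency.
amounts = [42, 38, 45, 40, 41, 39, 44, 43, 950]
print(find_anomalies(amounts))
```

The median is used instead of the mean because a single extreme value can distort the mean and standard deviation enough to hide itself, especially in small samples.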

Reinforcement learning

Reinforcement learning models are not trained on a prepared dataset; they learn from experience, by trial and error, receiving rewards for good actions. They are used when no historical data is available or such data is not required for training. Reinforcement learning has been especially successful with games such as noughts and crosses, chess, or Go.
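Learning from experience can be sketched with tabular Q-learning, one classic reinforcement learning algorithm. In the toy example below, an agent moves along a corridor of five positions and is rewarded only on reaching the last one. The environment, the parameter values and the function names are all assumptions made for illustration.

```python
import random

N_STATES = 5          # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Move within the corridor; reaching the last position pays a reward of 1."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table of action values from random exploration (Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            a = rng.randrange(len(ACTIONS))
            next_state, reward = step(state, ACTIONS[a])
            # Nudge the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q_table = train()
# Greedy policy: in each non-goal position, pick the action with the higher value.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct. It discovers this from the rewards alone, which is the same principle that, at far larger scale, underlies game-playing systems.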

Artificial intelligence, machine learning and deep learning

Both machine learning and deep learning are sub-fields of artificial intelligence.

Artificial intelligence is a broader concept whose premise is creating software that can simulate human thinking and behaviour. Artificial intelligence models don’t have to be pre-programmed; they use algorithms that can learn from the environment and take actions to achieve their goals.

Artificial intelligence models:

perform tasks like a human does but at a higher speed,
support solving complex problems,
deal with structured, semi-structured and unstructured data,
power applications such as Google Assistant, customer support chatbots, or online games.

Deep learning, on the other hand, is a subset of ML based on neural networks. An artificial neural network (ANN) is loosely modelled on the neurons in the human brain. It consists of an input layer, one or more hidden layers, and an output layer; deep networks stack many hidden layers. Every neuron that receives information processes it and then transmits it to the next connected nodes, until the output layer produces the result.
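The flow of information through the layers can be sketched with a tiny network: two inputs, one hidden layer of two neurons, and one output. The weights below are hand-picked for illustration so that the network computes "exactly one input is on" (XOR); in practice the weights are learned during training, and real networks are far larger.

```python
def relu(x):
    """Rectified linear unit: pass positive values through, clip negatives to 0."""
    return max(0.0, x)


def forward(x1, x2):
    """Run two inputs through one hidden layer of two neurons and one output."""
    # Hidden layer: each neuron computes a weighted sum plus a bias, then ReLU.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a weighted sum of the hidden activations.
    return 1.0 * h1 - 2.0 * h2


for inputs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(inputs, forward(*inputs))
```

The hidden layer is what makes this possible: no single weighted sum of the two inputs can separate these four cases, but two hidden neurons combined by the output neuron can.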

Deep learning models:

require large amounts of data to work effectively,
typically take longer to train than classical ML models but can give more accurate results,
usually need specialised hardware such as GPUs to be trained efficiently,
are used in, among others, computer vision, speech recognition or machine translation.

Machine learning in the cloud

One of the key players in machine learning is Google, with its cloud computing services.

Google Cloud provides its users with ML tools that are divided into four areas:

machine learning APIs that are already trained on Google’s datasets and can be easily integrated into an application,
services to train models on the user’s own datasets,
a platform to develop your own models from scratch,
infrastructure tools that can be used to build and host machine learning models.


Machine learning APIs

These are services that Google has trained on its own data and released as APIs.

They include Vision AI for image recognition, Natural Language for analysing text and performing sentiment analysis, Video Intelligence for video content analysis, the Translation API (machine translation that lets you make your apps multilingual), and Speech-to-Text (an automatic speech recognition service).

You don’t need to know how to build ML models to benefit from these services. Google Cloud handles the training side of machine learning: gathering data and building a predictive model. As a developer, you can call all these machine learning APIs from your code using client libraries for Python, Node.js, Java, Go, C#, PHP or Ruby.
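As a rough sketch of what such a call looks like in Python, the function below sends a local image to the Vision API and returns the labels it detects. It assumes the google-cloud-vision package is installed and that Google Cloud credentials are configured in the environment. The file name and the keep_confident helper are hypothetical additions for illustration.

```python
def detect_labels(path):
    """Ask the Vision API which labels it sees in a local image file."""
    # Imported here so the rest of the module loads without the package.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as image_file:
        image = vision.Image(content=image_file.read())
    response = client.label_detection(image=image)
    return [(label.description, label.score) for label in response.label_annotations]


def keep_confident(labels, min_score=0.8):
    """Hypothetical helper: keep only labels scored at or above min_score."""
    return [name for name, score in labels if score >= min_score]


# Example call (needs network access and credentials, so it is left commented out):
# print(keep_confident(detect_labels("storefront.jpg")))
```

No model is trained or hosted by the caller here. The application only sends the image and interprets the scored labels that come back.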

AutoML

You can use the AutoML platform to train a machine learning model on your own datasets.

Google Cloud provides several AutoML products, including AutoML Vision for training machine learning models to classify images according to labels defined by a user, AutoML Natural Language for categorising and analysing documents, and AutoML Tables for building and deploying models that work on structured data at high speed.

Let’s suppose you run a manufacturing company and want to introduce an automatic quality check process. You can upload images of previously damaged products and train your own model using AutoML Vision to recognise potential defects on the production line.

Vertex AI

Vertex AI is a single, managed machine learning platform that lets you build, deploy and scale ML models. It integrates with open-source frameworks such as TensorFlow, PyTorch, or scikit-learn, and supports other ML frameworks via custom containers. The platform also features MLOps tools like Model Monitoring for monitoring the quality of your deployed models, Pipelines for orchestrating repeatable training and serving pipelines, and Feature Store for organising, storing and serving ML features.

Infrastructure tools

Google Cloud also lets you use raw machines and tools to create and host your ML models. Infrastructure tools include Deep Learning VM Images, which are Compute Engine instances pre-installed with the latest versions of ML frameworks like TensorFlow, PyTorch, or scikit-learn. There are also Cloud GPUs at your disposal, which are great for speeding up computing jobs like machine learning, scientific computing and 3D visualisation, and Cloud TPUs, which are designed to accelerate training and running machine learning models.
