Neural Network Language Models and the future of Artificial Intelligence

November 11, 2021
Sachin Panicker, VP Product Engineering

What a time to be alive. We are in the throes of a revolution in Artificial Intelligence (AI), one that is driving us toward the Singularity. It all started with a paper titled ‘Attention Is All You Need’, presented at the Neural Information Processing Systems conference in 2017. Until then, language models were mostly built on Recurrent Neural Networks (RNNs), including their Long Short-Term Memory (LSTM) variant, and on Convolutional Neural Networks (CNNs). But first, let’s look at what a language model is and why it holds so much significance in the field of AI.

Imagine how powerful our systems could become if they used unsupervised learning to mine, extract, and learn useful information from every spoken language and every text ever written, and from images, audio, and video as well. Imagine what we could do if our systems became intelligent enough to analyze such data and use it to predict, categorize, and process further. Language models do part of precisely this: using Natural Language Processing, they estimate the probability of a sequence of words. That capability helps with translation, text summarization and classification, question answering, and more.
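To make "the probability of a sequence of words" concrete, here is a minimal sketch of a bigram language model, a far simpler relative of the neural models discussed in this article. The toy corpus, the function names, and the `<s>` and `</s>` boundary markers are illustrative assumptions, and no smoothing is applied, so any unseen word pair drives the probability to zero.

```python
from collections import Counter

def train_bigram_model(sentences):
    """Count how often each word appears as a context and each word pair occurs."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> are assumed markers for sentence start and end.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts

def sequence_probability(sentence, context_counts, bigram_counts):
    """Multiply P(word | previous word) over the sentence (no smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for previous, current in zip(tokens, tokens[1:]):
        if context_counts[previous] == 0:
            return 0.0  # the previous word was never seen as a context
        probability *= bigram_counts[(previous, current)] / context_counts[previous]
    return probability

# Toy corpus, purely illustrative.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
context_counts, bigram_counts = train_bigram_model(corpus)

p_seen = sequence_probability("the cat sat", context_counts, bigram_counts)
p_unseen = sequence_probability("sat the cat", context_counts, bigram_counts)
print(p_seen, p_unseen)
```

On this corpus, the familiar word order "the cat sat" receives a probability of about 0.33, while the scrambled "sat the cat" receives 0. Neural language models replace these raw counts with learned parameters, which lets them generalize to word sequences they have never seen.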

Some of the first models used for language understanding tasks were based on RNNs. An RNN is an Artificial Neural Network that can persist prior information, while an LSTM is a special kind of RNN capable of learning long-term dependencies. What these models lacked, Transformers more than made up for with a quite simple neural architecture based only on attention mechanisms. By doing away with recurrence and convolutions entirely, Transformers can be trained in parallel and therefore much faster.
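As a rough illustration of how an RNN "persists prior information", the sketch below runs a single recurrent unit over a short sequence. The weights are arbitrary values chosen for the example, not learned ones, and a real RNN would use vectors and weight matrices rather than single numbers.

```python
import math

def rnn_forward(inputs, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Run a one-unit recurrent network and return the hidden state after each step."""
    hidden = 0.0
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state,
        # which is why the steps must be computed one after another.
        hidden = math.tanh(w_input * x + w_hidden * hidden + bias)
        states.append(hidden)
    return states

# A single non-zero input followed by zeros.
states = rnn_forward([1.0, 0.0, 0.0])
print(states)
```

The hidden state stays above zero after the input has gone quiet, so the first input is still "remembered", but it shrinks at every step. That fading is one reason plain RNNs struggle with long-term dependencies, and the step-by-step loop is the sequential bottleneck that Transformers remove.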

Artificial Neural Network
Source: Adobe Stock

At its core, a Transformer uses the self-attention mechanism, and since it does not process its input sequentially, it can be parallelized on Graphics Processing Units (GPUs) or Cloud Tensor Processing Units (TPUs), lending itself to high speeds of execution. Transformers had already become the darling of AI researchers, and their use was spreading rapidly, when along came the Generative Pre-trained Transformer (GPT) models, which can take on new tasks through few-shot or zero-shot learning, that is, from a handful of examples or none at all. What really took the world of AI by storm was the third generation, GPT-3, developed by OpenAI, an Artificial Intelligence company. It has 175 billion (B) parameters and was trained on a vast corpus of text drawn largely from the public internet.
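The self-attention idea can be sketched in a few lines. The version below is a deliberately stripped-down, pure-Python take on scaled dot-product self-attention: it assumes the queries, keys, and values are all the raw input vectors, whereas a real Transformer first applies learned projection matrices and uses several attention heads. The token vectors are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this token against every token, scaled by sqrt(dimension).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all the value vectors.
        output = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

# Three made-up 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each pass through the outer loop reads only the original inputs and never the result of another pass, so all tokens can be processed at the same time. That independence is what makes the computation a natural fit for GPUs and TPUs.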

OpenAI created quite the buzz with GPT-3, and the technology sector is now abuzz with talk of a GPT-4 release, rumored for this year. Riding on the success of GPT-3, OpenAI created Codex, an AI system that turns natural language into code. With it, one can turn comments into code, explain and translate code, and complete the next line or function in context. It can surface knowledge, such as a useful library or API call for an application. One can add comments, rewrite code for efficiency, and convert code from one language to another. This goes well beyond code completion and could evolve into code generation from scratch. And there are many more potential ways of using it to automate tasks.

Right now, we are seeing a race toward building ever larger language models. Be it the Israeli Jurassic-1 with 178 B parameters, the Chinese Wu Dao with 1.75 trillion (T), or GPT-4 itself, rumored at 175 T parameters, large language models will no longer be the monopoly of a Google or a Microsoft.
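To get a feel for what these parameter counts mean in practice, the back-of-the-envelope sketch below estimates how much storage the weights alone would need. It assumes each parameter is stored as a 16-bit number (2 bytes), which is an assumption made for this example and not a published detail of any of these models. The parameter counts are the ones quoted above.

```python
# Assumption for this sketch: every parameter is stored in 16 bits (2 bytes).
BYTES_PER_PARAMETER = 2

def weights_size_gb(parameter_count, bytes_per_parameter=BYTES_PER_PARAMETER):
    """Approximate size of the raw weights in gigabytes (1 GB = 10**9 bytes)."""
    return parameter_count * bytes_per_parameter / 1e9

# Parameter counts as quoted in this article.
models = {
    "GPT-3": 175e9,
    "Jurassic-1": 178e9,
    "Wu Dao": 1.75e12,
}

sizes = {name: weights_size_gb(count) for name, count in models.items()}
for name, size in sizes.items():
    print(f"{name}: about {size:,.0f} GB of weights")
```

Under that assumption, GPT-3 alone comes to roughly 350 GB of weights, far more than a single typical accelerator can hold, which helps explain why building models of this size has so far taken the resources of large organizations.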

Artificial Intelligence
Source: Adobe Stock

What does all this mean for us? The impact will be felt in many ways. In fact, the possible applications are so diverse that almost anything we can conceive could be made possible using technology. Let’s pick a few scenarios. Picture this.

a. A Business Analyst, while gathering requirements, will be able to model systems, business processes, and workflows, and design interfaces in a live session with the Customer, thereby significantly shortening the turnaround time. One could also define acceptance test cases and write user stories by feeding the model a few preliminary test cases and stories as examples.

b. A Quality Assurance engineer would be able to generate test cases. For that matter, one could even automate the writing of automated test scripts. The possibilities are endless.

c. A Developer would be able to use it to code faster and more accurately, whether for code completion, relevant code snippet suggestions, or answers of the kind found on Stack Overflow. Move over Java, C#, and JavaScript: English is a programming language now. At a conservative estimate, it could bring down the effort for development and related tasks by more than 60%. Developers can prepare quick unit tests, and Functional experts can write SQL queries.

d. A User Interface/Experience (UI/UX) Designer will be able to create low-fidelity wireframes and high-fidelity Mockups on the fly while having live discussions with the Business Owners. Business Analysts and UX Designers can prepare prototypes while interviewing Customers and rework and finalize all of them in a single working session.

e. A DevOps Engineer will be able to automate the DevOps pipeline. For example, one could generate an entire Web project along with the Docker commands to set it up and run its containers.
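Several of these scenarios, such as writing user stories from a few samples in scenario (a), rely on few-shot learning: the model is shown a handful of worked examples inside its input and asked to continue the pattern. The sketch below only assembles such a prompt as plain text. The `Requirement:` and `User story:` labels, the function name, and the sample stories are all hypothetical choices made for this illustration, and sending the prompt to a model is deliberately left out.

```python
def build_few_shot_prompt(task_description, examples, new_requirement):
    """Assemble a few-shot prompt: instructions, worked examples, then the open case."""
    lines = [task_description, ""]
    for requirement, user_story in examples:
        lines.append(f"Requirement: {requirement}")
        lines.append(f"User story: {user_story}")
        lines.append("")
    # The final entry is left unfinished so the model completes it.
    lines.append(f"Requirement: {new_requirement}")
    lines.append("User story:")
    return "\n".join(lines)

# Hypothetical sample stories a Business Analyst might supply.
examples = [
    ("Customers can reset a forgotten password.",
     "As a customer, I want to reset my password so that I can regain access."),
    ("Admins can export monthly reports.",
     "As an admin, I want to export monthly reports so that I can share them."),
]

prompt = build_few_shot_prompt(
    "Rewrite each requirement as a user story.",
    examples,
    "Users can save items to a wishlist.",
)
print(prompt)
```

With zero examples in the list, the same function produces a zero-shot prompt, where the model has only the task description to go on.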

The future will be not only about speed of execution but also about a universe of infinite possibilities. As we adapt to newer technologies, the best approach is to unlearn and rewire our minds, because only then can we develop a fresh outlook, unhindered by impressions accrued over time.

Businessman holding digital chatbot
Source: Adobe Stock

Now, let us explore some industry use cases. The implications are profound for every industry, and the range of applications is broad. As people and businesses experiment more, they will define newer avenues of application. Some of the most evident uses are a human-like intelligent chatbot and a website or mobile app maker.

More specifically, insurance needs analysis and underwriting could benefit from intelligent software agents that simply chat with a user, complete the process, and give better results. A fintech company could use it to detect and prevent fraud at a level where most present technologies fail. Crimes could be prevented before they happen. Art, music, screenplays, and videos could be created by anyone. Material worthy of an Oscar, an Emmy, or a Golden Globe could be produced by a newcomer. These are just a few use cases to tickle your mind, and they are only the tip of the iceberg.

Transfer data, big data and internet of things (IoT) concept
Source: Adobe Stock

Finally, a parting thought: Artificial General Intelligence (AGI) would be the greatest leveler of all time, bringing AI and related technology to the average person, who could use it without needing to learn it or know how it works behind the scenes. Coupled with 5G or even 6G, this would make for a future where we are all equal and no one starts with an unfair advantage. And quantum computing would unlock our machines’ potential to take a leap like none before.

All this combined will help humans make progress in ways that are unthinkable today. As an example, all the progress that science and technology have made throughout history thus far could be covered by AGI, 5G/6G, and quantum computing together in just a matter of seconds.

Get in Touch

Drop us a message and one of our Fulcrum team will get back to you within one working day.