By Terry Room
Like many, I have been keenly watching the recent developments in AI with the advent of next-generation Large Language Models such as ChatGPT. I recently attended a CTO round-table event where we discussed the transformative power of AI. One domain we centred on was software engineering and the disruptive potential of AI assistive tools such as GitHub Copilot and Amazon CodeWhisperer.
Some of the questions which drove the debate:
· How will these tools make the creation and support of software more efficient?
· Will we need fewer developers in the future?
· Will the lines between no-code, low-code and custom code apps continue to blur?
· What governance should be in play, and what risks must we manage?
Here are some reflections and follow-on thoughts.
The Magic of Software
Arthur C. Clarke, author and futurist, famously wrote:
“Any sufficiently advanced technology is indistinguishable from magic.”
Initial interactions with Large Language Models (LLMs) such as GPT can feel magical – a far cry from Googling and trawling through links to stitch together the information needed to solve a particular problem. The ‘magic’ taps deeply into our consciousness, into the natural language processing areas of our brain, the core of our cognition and how we communicate with the outside world. Interaction with ‘talking machines’ which seem indistinguishable from a human being has simultaneously and rapidly captured our collective imagination and concern.
As a sense check of the state of the art, I think back to the rise of natural language interfaces and bots a few years ago. In some cases they were a useful and valuable means of enabling channel shift and accessibility – in others, an added source of digital frustration. The believability of the new generation of language models is on a whole new level by comparison.
For software engineering, seeing a machine generate code can feel even more magical, especially when compared to the practices of the founders and pioneers of electronic computing (when bugs were actual bugs!). ‘Prompt-based’ engineering approaches stand in stark contrast to punching holes in cards and then waiting a day for your program to be scheduled, just to see if it ran without errors.
But the hype around generative AI when applied to app creation requires closer inspection. And a good place to start is to look at some of the current offerings available.
GitHub Copilot
Copilot provides code suggestions based on prompts and code context. Recent updates have filtered out insecure recommendations such as hard-coded credentials and SQL injection vulnerabilities. GitHub’s research claims that the rate at which recommended code is accepted has increased from 27% to 35% with these changes (with variance by language). The brand positioning is also worth noting: it is a ‘Co-pilot’ and not an ‘Auto-pilot’ (more on this later).
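To illustrate the class of issue being filtered, here is a minimal sketch (in Python with sqlite3 – my own example, not Copilot output) of why string-built SQL is dangerous and why the parameterised form is the safe suggestion:

```python
import sqlite3

# Illustrative only: the SQL-injection pattern assistive tools now try
# to avoid suggesting, alongside the safe parameterised alternative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

user_input = "alice' OR '1'='1"  # a typical injection payload

# Unsafe: string interpolation lets the payload rewrite the query.
unsafe_query = f"SELECT role FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_query).fetchall()  # returns ALL rows

# Safe: a parameterised query treats the input as data, not SQL.
safe_rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()  # returns no rows -- no user has that literal name

print(len(unsafe_rows), len(safe_rows))  # 2 0
```

The point is not the specific API but the shape of the suggestion: a tool that emits the second form by default removes a whole vulnerability class from generated code.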
CodeGPT is an interesting Visual Studio Code add-in, which has a rich set of generative features such as:
- Get Code (generates code based on natural language)
- Ask Code (ask any generative question, e.g. ‘write me C# which validates an email address using a regex’)
- AskCodeSelected (ask a generative question of any selected code, e.g. ‘generate a .md file with markdown’ or ‘convert this to Python’)
- Refactor (refactors code, with recommendations to reduce dependencies, improve readability etc.).
Interestingly, it plugs into a number of LLMs (OpenAI, Cohere, AI21, Anthropic).
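As an illustration, here is the kind of function an ‘Ask Code’ prompt might produce – sketched in Python rather than the C# of the example above. The function name and regex are my own, deliberately simple and nowhere near RFC 5322-complete:

```python
import re

# Hypothetical sketch of output from a prompt like
# "write me a function which validates an email address using a regex".
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def is_valid_email(address: str) -> bool:
    """Return True if the address matches a basic email shape."""
    return EMAIL_RE.fullmatch(address) is not None

print(is_valid_email("dev@example.com"))  # True
print(is_valid_email("not-an-email"))     # False
```

Even in this trivial case the ‘co-pilot not auto-pilot’ point applies: the developer still has to judge whether a simple pattern is fit for purpose or whether full standards-compliant validation is required.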
Visual Studio IntelliCode
This is the AI-enabled next generation of IntelliSense. IntelliCode prompts are based on the context of the code you are writing, not just an indexed list of overloads. It also has features for addressing stylistic consistency across your code, and will even flag potential typos, such as mistakes in variable use. Also of note is that it runs locally on your dev stack and does not call out to cloud-hosted APIs – a key requirement for highly secure code creation processes in regulated industries such as Financial Services and Health.
Similar in intent to Copilot, providing contextual code-completion suggestions.
An interesting service which offers various code generators, by language, as well as translators, unit test writers, query writers, schema resolvers and security analysers.
We’ve also seen generative AI being included in no-code/low-code platforms. Microsoft Power Apps, for example, provides an OpenAI-powered natural language interface (‘create a workflow…’, ‘build a table…’) to take even more of the toil away from building this type of app, further blurring the boundaries between no-code, low-code and custom.
Evolution vs Revolution
A little perspective can be helpful sometimes.
Programming has evolved considerably from the first days of electronic computing to harness increasingly powerful hardware, and has been applied to create an increasingly complex array of applications. We’ve gone from punch-card mainframe systems to highly complex distributed systems.
The evolution of the tools, skills, processes and practices for the successful creation of these systems has had to keep pace. In fact, it can be argued it has set the pace: innovations in tooling and practices have gone hand in hand with, and provided the ability to exploit, increasing computing power.
We evolved to client server, to cloud, and from monoliths to distributed API enabled microservice architectures.
The modern developer is massively enabled compared to the early pioneers – frameworks and languages, IntelliSense-style tools, static analysis and linting tools, managed runtimes, increasingly safe compilers, increasingly powerful code collaboration platforms, security analysis, automated build and release tools, branch management tools. Yet the demands on the modern developer have increased proportionally – faster delivery, higher quality with fewer bugs, more secure against increasing cyber threats. And as an overall trend, over the last five years we have seen that demand accelerate, fuelled by mass adoption of cloud computing and the digital transformation of many industries.
So is generative AI a game-changer, or the next natural step in the computing tradition of the last 50 years?
It is also necessary to ask where (and how) ‘generative software development’ plays in the end-to-end value chain. I would start with: if you are looking for productivity gains, you should start elsewhere! There is still a lot of productivity upside in many of today’s enterprises through a right-sized approach to standardisation of architecture patterns, tools, platforms, templates, libraries and methodologies. Furthermore, it is important to look holistically and to frame the problem that actually needs solving, i.e.:
‘How can we deliver digital products which underpin our business strategy faster and at higher quality, whilst remaining robustly secure?’
Co-pilot not Auto-pilot
In his best-selling book Outliers, Malcolm Gladwell cited the safety transformation story of Korean Air. Central to this transformation was not technology, but dealing with a cultural legacy – in particular an inbuilt deference to one’s superiors, such as first officers deferring to the captain even when it was obvious that the captain was about to make a catastrophic error. Transforming from one of the worst safety records to one of the best was achieved by dealing with this cultural legacy – by training flight staff in clear and concise communication, and empowering staff (and making it an imperative) to validate and challenge each other.
Similarly, when we consider driverless cars, there are many legal and ethical issues to address before vehicles ever become ‘totally driverless’. Whilst assistive driving technology has the potential to improve road safety on aggregate, there remains the gnarly question of who is responsible in the event of an accident: the driver, the other road user, the software, the model, the hardware? Are we going to outsource this to our AI lawyers and AI insurance underwriters? Of course not – ultimately the driver must still be accountable.
As a side note on the state of the art in driverless tech, Ford’s system has just been approved for use on UK motorways. It monitors the driver whilst providing a ‘hands-off’ driving experience – think ‘next-gen cruise control’ rather than fully driverless.
The key point here is that AI should be considered assistive, and never in full control. Even when the AI takes increasing levels of control, the system should still have failsafes built in.
The same mindset should apply to the code you create. As the creator, you own the app – intent, features, architecture, technology building blocks, non-functional characteristics. This is a point made clear on the landing page of GitHub Copilot. Any AI-enabled app architecture must fail safe.
All Apps != All Code
Another perspective on assistive AI software generation needs consideration. In the technology industry we have a tendency to talk about ‘code’ in a somewhat generic sense. To the lay person, code is code. But all apps are not equal, and all code is not equal.
For illustration, consider a payment processing engine, or a software control system for a power station. Both have non-functional characteristics such as extreme availability, robust resiliency features such as idempotency and transaction management, and stringent security controls to protect processes and data from many threat vectors. The consequences of failure of such systems can be severe.
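Idempotency, for example, is the property that makes retried payment requests safe. A minimal sketch in Python (the class, method and field names are hypothetical, and this is far simpler than a real payment engine):

```python
import uuid

class PaymentProcessor:
    """Toy illustration of idempotency: a retried request carrying the
    same idempotency key is processed exactly once."""

    def __init__(self):
        self._processed = {}   # idempotency_key -> stored result
        self.charges_made = 0  # stand-in for the real side effect

    def charge(self, idempotency_key: str, amount: int) -> dict:
        # If we have already seen this key, return the stored result
        # instead of charging the customer a second time.
        if idempotency_key in self._processed:
            return self._processed[idempotency_key]
        self.charges_made += 1
        result = {"status": "charged", "amount": amount}
        self._processed[idempotency_key] = result
        return result

processor = PaymentProcessor()
key = str(uuid.uuid4())
processor.charge(key, 100)
processor.charge(key, 100)  # network retry: safe, no double charge
print(processor.charges_made)  # 1
```

Getting properties like this right (atomically, across distributed nodes, under failure) is exactly the kind of engineering judgement that distinguishes these systems from a generated forms app.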
By comparison, consider a field service app, maybe one enabled by a no-code/low-code app platform, where forms, processes and data storage are generated, and appropriate security and data controls are built in.
The end-to-end process for the construction of these apps is vastly different – like comparing the creation of a 12-bar blues to a symphony. Yes, they are both still music, in the same way that the software in all systems is still code. The implication is that the mileage of generative software development will, and should, vary based on the type of app or system you are developing.
Maintaining the craft of building apps
But there is more to it than this. The actual creation of code is just one part of the process of constructing digital products, albeit an important one. What about architecture (enterprise, solution, app, system, service, data)? What about a security model based on threat models and regulatory compliance needs? More fundamentally, generative AI will not identify the need for an app or platform – what needs does it serve, what value does it create, what investment is required? Nor will it manage the complexity of delivery and execution – which processes should we use, what should the team structure look like, what quality and assurance controls should we apply, how should it be operated and (most importantly) how will we manage people (strengths, weaknesses, communication, culture, aspirations, hopes, fears)?
It is safe to assume that generative AI will take away some of the toil of the code creation process, allowing the developer to spend their time on higher-order and higher-value tasks. But we must still endeavour to maintain the craft of building software and not outsource it all to the machines. This is because we still need to know what good looks like. We need to know what secure code looks like. We need to know whether the architecture of the system being created is appropriate and fit for purpose (with appropriate failsafe mechanisms built in). And of course we need to guard against ‘AI hallucinations’. Fundamentally, we should not sleepwalk into the enablement of our developer androids without the right controls firmly in place. Software that generates software (that generates software!?). We must continue to own the craft of creation. Anything else would be an abdication of responsibility.
AI Enabled Development Futures
It is clear that generative AI has high potential to offer efficiency gains and increase developer productivity, and to improve the developer experience.
Whether we will need fewer (or more) developers is impossible to predict accurately (like most predictions on complex questions!), but it is clear that the developer experience will continue to evolve at pace. The opportunity is there to harness these productivity gains to create better software, and at greater pace. Will we see the rise of the Chief Prompt Engineer? Maybe. But code generation is just part of the story. If our ‘AI assistant developers’ take away some of the toil of repetitive coding, we can focus on creating new classes of distributed systems and apps to help solve the pressing issues of our time (sustainability, the environment, food poverty, the health of an increasing and ageing population), backed by emergent capabilities such as machine learning and quantum computing (which your deep learning model will probably not be able to help you with, by the way!). We can build apps more effectively, where cost and value are more in line, and where the risks of delivery and operation are significantly reduced, even against an increased landscape of cyber threats.
But there are many issues to address, and sooner rather than later. It took many (many!) years for the various policy and regulatory frameworks to govern the web (some would argue even this is by no means done yet). This time governance needs to move faster, and to keep pace with the innovation in technology.
‘Co-pilot not auto-pilot’ practices should be mandated, and need to be not just a safe practice but one which is backed practically – for example with ‘generative transparency’ tools (where generated parts of a codebase are clearly tagged, and therefore validated and tested as such) – as well as by appropriate policy controls and regulation. What are the (Hippocratic) responsibilities of the developer (and architect) here? And intellectual property and copyright issues need greater clarity.
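A ‘generative transparency’ convention could be as simple as begin/end comment markers around generated regions, which review and test tooling can then locate. A speculative sketch in Python – the marker text and function are my own invention, not an existing standard:

```python
# Assumed convention: generated code is fenced by comment markers so
# tooling can find, review and test those regions explicitly.
GEN_BEGIN = "# GENERATED-BY-AI: begin"
GEN_END = "# GENERATED-BY-AI: end"

def generated_regions(source: str):
    """Return (start_line, end_line) pairs for tagged generated blocks."""
    start = None
    regions = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped == GEN_BEGIN:
            start = lineno
        elif stripped == GEN_END and start is not None:
            regions.append((start, lineno))
            start = None
    return regions

sample = """\
def handwritten():
    pass
# GENERATED-BY-AI: begin
def suggested():
    pass
# GENERATED-BY-AI: end
"""
print(generated_regions(sample))  # [(3, 6)]
```

A CI gate built on something like this could, for instance, require that every tagged region has dedicated tests and a named human reviewer.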
Sustainability needs to be addressed (training LLMs uses a lot of compute power, and Moore’s Law may be on the wane!).
Models will need to be diversified, to align with the compliance needs of different industries. And the compliance needs of different industries will need to be redefined.
Does the current generation of generative AI technology dream of building apps? Maybe, but the extent to which we allow that is entirely up to us. We need to maintain control.
Terry Room is an experienced technology leader and architect with many years of experience helping large organisations across industries such as Finance, Energy, Manufacturing and the Public Sector achieve value through technology investment, including the architecture, design and delivery of mission-critical platforms. More recently he has focused on the capture of value in hyperscale public cloud platforms, supporting digital strategy creation, technology architecture and delivery.