Monday, September 20, 2010

High Level Logic: Rethinking Software Reuse in the 21st Century

Introduction

An application programmer spends six months perfecting a set of components commonly needed in the large company that employs him. Some of the components were particularly tricky and key pieces required very high quality, reliable and complex exception handling. It has all been tuned to run quickly and efficiently. Thorough testing has demonstrated his success. Part of his idea of “perfection” was to build in a way that the software, even many of the individual components, could easily be reused. But it is surprisingly likely that no one outside of a small group within the project will ever hear of it.

Tens of thousands of programmers building thousands of applications, repeatedly building the same functionality over and over again. A collective nightmare for CFOs. There are those who believe that the problem has been addressed with “best practice” object oriented programming techniques, enterprise frameworks, and strategic development of specialized open source systems. Certainly they are right – up to a point.

While many available tools and good programming technique offer opportunities to reuse software, and most definitely reduce the need to rebuild many “lower level” (relatively speaking) functions, they also facilitate development of much more complex systems and provide a plethora of gizmos for doing old things in new ways, producing a never-ending stream of reasons to update what has already been done. Out there on the edge, where application programming actually takes place, the software reuse problem is a moving target.

Generally speaking, the benefits of software reuse far outweigh the costs. [1] But in the messy world of real-world application development, the challenge can be complex. Many managers value rapid prototypers over “best practice” engineers, not understanding that building on somewhat sloppy early results will typically and dramatically increase project time and cost and reduce quality. In larger organizations with shared interests, questions arise. Which project should bear the increased cost of building reusable software components? Who pays the cost of searching the (sometimes huge, distributed, and insufficiently documented) code-base to look for possible matches? Should a software project focus time and money on special packaging, documentation, and “marketing” material to promote reuse of components it builds?

I believe it is possible to realign the software development process in a way that will make everyone happy: from the executives who will see measurable improvements in productivity, to project managers pushing for rapid results, to the programmers who fantasize about widespread use of their best work, to the CFOs who see the company mission fulfilled on a smaller software maintenance budget.

Such a dramatic statement needs a theatrical follow-up. In the spirit of The Graduate, I just want to say one word to you – just one word. Are you listening?

Configuration.

Exactly what do I mean by that? There is a great future in configuration. Think about it. Will you think about it? … Shh! Enough said. That's a deal.

OK, it's not actually enough said in this context. I'll get back to “configuration” below. What I want you to think about first, really think about, is that this is the age of software components.

In the distant past, it was easy to see that it would be useful to represent often repeated binary sequences in hexadecimal code, then an obvious step to package sections of code into a machine language to handle common operations at a higher level. Then combine those into commonly used functions. It's been almost a half century since we got a little excited about “structured programming.” We built functions and libraries, and once again noticed that program structure and flow as well as development tasks often had commonalities across applications. Special tools emerged. It has been thirty-five years since the first IDE was created.

Believe it or not, it has been more than four decades since an object oriented programming language with classes and instances of objects, as well as subclasses, virtual methods, co-routines, and discrete event simulation, emerged from simulation research (Simula 67). C with Classes was renamed C++ in 1983 and developers quickly replaced their C compilers with C/C++ compilers. The Java Language Project was initiated in 1991 and Sun Microsystems offered the first “Write Once, Run Anywhere” public implementation in 1995 and the first release of the Java Enterprise Edition in 1999. This is the age of software components. But even one decade is a very long time in the software world. One might almost expect that something new is about to happen.

Beyond Guestbooks and Smiley Faces

If you're entrepreneurial, perhaps you have already realized that you could package sets of useful components as finished product suites (components = products). If you are an independent consultant or operate a specialized software development company, you can offer high quality services based on proven technology with your own product suite(s). (Shame on you if you don't already.)

But let's say that you want to build a complete system, for some purpose, that does something besides impress people with its quality and reusability in the development process – an application. Adaptation by configuration is pervasive. Here are some examples.

Word processing software serves a very specialized purpose. It is adaptable by configuration. You can install specialized language support, adjust the characteristics of various text blocks, add templates for complete (reusable) document layouts, and even adapt it for use by the visually impaired. Some word processing systems are also extensible.

Lotus Notes has a history that reaches back into the 1970s (PLATO Notes). It is familiar to many software developers (and others) as an interactive, networked system that is adaptable (by configuration) to the specifics of a project or other activity, and also extensible. This is a bit more general than a word processor, providing a suite of services, but still somewhat specialized. IBM offers both extensions and tools. Custom code can in fact be added to extend the capabilities of the out-of-the-box system. Extending the concept, Lotus Sametime is offered as middleware for building custom networked applications.

WordPress “is web software you can use to create a beautiful website or blog,” says the WordPress website. “We like to say that WordPress is both free and priceless at the same time.” It continues: “The core software is built by hundreds of community volunteers, and when you’re ready for more there are thousands of plug-ins and themes available to transform your site into almost anything you can imagine. Over 25 million people have chosen WordPress to power the place on the web they call ‘home’.”

People all over the world build and post their own components. It doesn't take a software professional or layers of bureaucracy to select and add powerful new interactive features (beyond guestbooks and smiley faces) to customize websites. Welcome to the 21st century (and pervasive CMS)!

The Brave New World

What if you could do that with all software development? And what if a major portion of the reusable software components in a company, starting with their design, were treated seriously as independent internal products rather than vaguely outlined portions of a large pile of 1s and 0s? The idea might be much more practical than you think.

The shift to object oriented programming changed the way programmers think about creating systems. Components are what systems are made of these days. This major technological paradigm shift has also had a major impact on project process, which now leans toward “lean,” discrete, and agile. [5]

Some of the most complex and potentially expensive aspects of software reuse involve getting all the software organized, identified, documented, and searchable. But consider what is already inherent in the tools and processes commonly used by modern software engineers. Software objects are arranged in packages. In best practice, naming of packages and components is systematic and aims to make functional purpose easy to recognize. Package identifiers can also serve to direct programs to the physical location of each component. Documentation on individual objects, arranged in packages, can be automatically generated (easy to organize and keep up-to-date).

Best software practices encourage reusability. If I'm creating an application that reads and interprets 25 XML files, it only makes sense to write one (only one) general purpose read and parse component for XML files, so long as that is possible, rather than new code for each file. Only that part which must be treated uniquely requires additional code.
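To make that concrete, here is a minimal sketch of what such a single read-and-parse component might look like, using only the XML parser that ships with Java SE. The class name SimpleXmlReader, its two methods, and the sample settings document are made up for illustration; they are not part of any particular project.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical general-purpose reader: one component shared by every XML file. */
final class SimpleXmlReader {

    /** Parses any XML stream into a DOM document; failures become unchecked exceptions. */
    static Document parse(InputStream in) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(in);
            doc.getDocumentElement().normalize();
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
    }

    /** Returns the trimmed text of each direct child element of the root, keyed by tag name. */
    static Map<String, String> childValues(Document doc) {
        Map<String, String> values = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Element) { // skip whitespace and comment nodes
                values.put(node.getNodeName(), node.getTextContent().trim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Illustrative document; a real application would pass file streams instead.
        String xml = "<settings><host>localhost</host><port>8080</port></settings>";
        Document doc = parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(childValues(doc));
    }
}
```

All 25 files would go through the same parse call. Only the interpretation of the resulting values would differ from file to file.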

My experienced observation is that much of the time, in common practice, building nifty general purpose code is less expensive than building sloppy spaghetti code. Building good code from the start dramatically decreases project time and cost. There will be fewer avoidable complexities in downstream development, fewer bugs, and consistently higher quality. Consider also that experience matters. Developers who spend their careers building good code, not only prefer doing the job right, but become extremely efficient at it. When they design, what will come to their minds is good code, not junk. When they implement the design, they are quite familiar with the techniques and details needed to build good code.

From a variety of perspectives, developing reusable components in the spirit of discrete products is beneficial, and the time is right. What more is needed then, to maximize the benefits of software reuse?

Regular Stuff + Artificial Intelligence = Something

The Java language and frameworks like Java EE continue the development path that started with binary sequences in the first section of this article. They differ in that one does not generally innovate on the concept of adding two integers, for example. Initially, getting good fast versions of common functionality for a variety of machines was the point. Both Java SE and Java EE (and others) provide support for higher level functionality supporting, for example, a variety of ways to move data around on a network for display and processing.

In the world of artificial intelligence research however, it seems people enjoy branching off in new directions, “moving things around” (so to speak) to change the character of computing. The old research definition for AI was simply to get computers to do things that, at present, humans do better. From the start, people thought about moving the human role (part of it anyway) into the computer.

In the mid to late 1980s, complex rule-processing software came on the market. New companies emerged marketing “expert systems” tools, large companies invested, and more powerful rule-processing capabilities were added to existing commercial products like database systems. A slightly deeper look yields something more interesting than a packaged way to process lots of “if-then” statements. AI researchers wanted to move logic out of application programs and into a general processing engine, with application logic treated as data. I'm going to cast that into the perspective I offer now, with a description that the researchers and developers at that time may never have used. Expert systems applications were built by configuring the processing engine with a rule base (and other information).
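A toy illustration of “application logic treated as data” might look like the following. It is only a sketch of the general idea, not a reconstruction of any commercial expert system: the Rule and RuleEngine classes and the sample facts are invented here, and the engine does simple forward chaining over plain strings.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical rule: if all premises are known facts, conclude a new fact. */
final class Rule {
    final List<String> premises;
    final String conclusion;

    Rule(String conclusion, String... premises) {
        this.conclusion = conclusion;
        this.premises = Arrays.asList(premises);
    }
}

/** Generic engine: it knows nothing about the application; the rule base is its configuration. */
final class RuleEngine {
    private final List<Rule> ruleBase;

    RuleEngine(List<Rule> ruleBase) {
        this.ruleBase = ruleBase;
    }

    /** Forward chaining: keep firing rules until no new fact is derived. */
    Set<String> infer(Set<String> initialFacts) {
        Set<String> facts = new LinkedHashSet<>(initialFacts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : ruleBase) {
                // Set.add returns true only when the conclusion is a new fact.
                if (facts.containsAll(rule.premises) && facts.add(rule.conclusion)) {
                    changed = true;
                }
            }
        }
        return facts;
    }
}

final class RuleEngineDemo {
    public static void main(String[] args) {
        // The "application" is just this data; swapping the list changes the behavior.
        List<Rule> rules = Arrays.asList(
            new Rule("engine-overheating", "temperature-high", "coolant-low"),
            new Rule("stop-vehicle", "engine-overheating"));
        Set<String> facts = new LinkedHashSet<>(Arrays.asList("temperature-high", "coolant-low"));
        System.out.println(new RuleEngine(rules).infer(facts));
    }
}
```

The engine contains no knowledge of engines or coolant. Handing it a different rule list yields a different application without touching the code.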

More powerful systems like KEE became commercially available in the same decade, incorporating a new and powerful programming component - objects - into the mix. The object oriented programming concept itself repackaged some of the common elements of complete applications into individual components; not just by definition, but by encouraged design practice. Its introduction was disruptive, setting vast numbers of working engineers to the task of rethinking how software systems should be written. An “object” you say? Sounds classy!

My agent is on the phone.

“A software agent is a piece of software that acts for a user or other program in a relationship of agency,” says Wikipedia (citing two sources [2][3]). Agent technology also has a history. The concept can be traced back to the actor model of 1973 [4]. An actor is “a self-contained, interactive and concurrently-executing object, possessing internal state and communication capability.” One might call it the ultimate object.
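For readers who think in code, the actor definition can be sketched in a few lines of Java SE. This is a deliberately small illustration under my own assumptions, not the formalism from [4]: the CounterActor class and its message names ("increment", "stop") are invented, and the mailbox is an ordinary blocking queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical minimal actor: private state, a mailbox, and its own thread of execution. */
final class CounterActor implements Runnable {
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private final CountDownLatch stopped = new CountDownLatch(1);
    private int count = 0; // internal state, modified only by the actor's own thread

    /** Other code communicates with the actor only by sending it messages. */
    void send(String message) {
        mailbox.add(message);
    }

    @Override
    public void run() {
        try {
            while (true) {
                String message = mailbox.take(); // blocks until a message arrives
                if ("stop".equals(message)) {
                    break;
                }
                if ("increment".equals(message)) {
                    count++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            stopped.countDown(); // signals that the final state is ready to read
        }
    }

    /** Waits for the actor to finish, then reports its final state. */
    int awaitResult() {
        try {
            stopped.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the actor", e);
        }
        return count;
    }
}

final class ActorDemo {
    public static void main(String[] args) {
        CounterActor actor = new CounterActor();
        new Thread(actor).start(); // the actor runs concurrently with the caller
        actor.send("increment");
        actor.send("increment");
        actor.send("stop");
        System.out.println(actor.awaitResult());
    }
}
```

Nothing outside the actor touches count directly. Self-containment, concurrency, internal state, and communication are all there, just in miniature.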

Agent technology has already emerged from artificial intelligence laboratories. Modern agents extend the “Write Once, Run Anywhere” idea, even to the extent that there are what might be called “door-to-door” salesman varieties; traveling agents (also called robots, bots, web spiders and crawlers and even viruses) that move around the Internet (sometimes duplicating themselves) to perform tasks.

The telecommunications industry recognizes the importance of a new technology that screams to be used as a central processing system for a wide range of applications that can service the wide range of networked devices available today. JADE (Java Agent DEvelopment Framework) is a free software framework distributed by Telecom Italia, that “simplifies the implementation of multi-agent systems.”

It changes the way you think about software development. Don't worry about the wide range of interfaces needed for so many devices. They're supported. Don't worry about the complexities of communication. The code is written and maintained by someone else. This goes beyond the relatively “low level” programming components available in IDEs and support offered by higher level development frameworks like Java EE. Much of “the system” already exists. Just focus on the very specialized components needed for your particular application that can be cast into the agent framework. Only that part which must be treated uniquely requires additional code.

You then let the framework know when your agents are needed. When they are, they get the call; automatically. And by the way; intelligent agents can sense when they are needed and respond appropriately, even learn and adapt to new circumstances.
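The register-then-get-called pattern can be sketched without any framework at all. To be clear, the following is not the JADE API; Agent, AgentPlatform, and the "weather" topic are hypothetical names chosen only to show the shape of the idea in plain Java SE.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical agent contract: application code implements only this. */
interface Agent {
    String handle(String content);
}

/** Hypothetical framework core: agents register for a topic and are called when it comes up. */
final class AgentPlatform {
    private final Map<String, List<Agent>> registry = new LinkedHashMap<>();

    /** Lets the platform know when an agent is needed. */
    void register(String topic, Agent agent) {
        registry.computeIfAbsent(topic, key -> new ArrayList<>()).add(agent);
    }

    /** Calls every agent registered for the topic and collects the replies. */
    List<String> dispatch(String topic, String content) {
        List<String> replies = new ArrayList<>();
        for (Agent agent : registry.getOrDefault(topic, Collections.<Agent>emptyList())) {
            replies.add(agent.handle(content));
        }
        return replies;
    }
}

final class AgentDemo {
    public static void main(String[] args) {
        AgentPlatform platform = new AgentPlatform();
        // The only application-specific code is the agent itself.
        platform.register("weather", city -> "Forecast requested for " + city);
        System.out.println(platform.dispatch("weather", "Stockholm"));
    }
}
```

A real agent framework adds messaging across machines, life-cycle management, and much more. The division of labor is the same, though: the platform owns the plumbing and the application supplies the agents.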

Sometimes one is not enough. A multi-agent system (MAS) is a system composed of multiple interacting intelligent agents. Multi-agent systems can be used to solve problems which are difficult or impossible for an individual agent or monolithic system to solve. Examples of problems which are appropriate to multi-agent systems research include online trading, disaster response, and modeling social structures.

High Level Logic

OK, quick! Think of a way to go beyond what's been discussed so far.

How about this?
  • Design what is basically an agent-type system that easily interacts with others of its own kind installed anywhere in the world and may also communicate with other agent systems facilitated by use of standard message structures. (Note that the FIPA standard message offers SOAP level power.)
  • Identify high level logical processes that you expect to be common to thousands of applications (maybe every conceivable application?) and provide generic engines for doing that part of the work. Do this in the spirit of “expert systems” above, i.e. by moving additional, common, higher level application logic into a generic engine for the first time and then embellishing the new generic processor. Also support more basic processes like intelligent XML, encryption, authentication, rule-processing ....
  • Provide a “general problem solving” (high level process … what do you want, what are the variables … etc.) engine.
  • Include the ability to carry out more complex processes by implementing custom plans.
  • Devise an even higher level generic processor that ties all of the above together in a top-level process.
  • Allow application developers to extend and customize if they wish, without destroying the ability of one system to interact, cooperate, and share with the others.
  • Provide a simple way to tie in a GUI – as needed – for a great variety of devices.
One more thing, while we're on the subject of code reuse.
  • Provide a mechanism to call upon any level of reusable code, from large systems to discrete components integrated through plans, rule-processors and other high level structures. Support options for utilizing common code remotely (if, for example, the process utilizes other remote resources), to fetch and utilize temporarily, and to fetch and store locally.
Now what you have is an outline for a system known as “High Level Logic” - HLL. The High Level Logic (HLL) Open Source Project stems from a history that goes back into the 1980s, as a concept for unleashing the power of expert systems. Prototype software was more recently built as part of a larger intelligent robotics project. The commitment to open-source development was made in July, 2010.

Although the development project is now (September 2010) at an early stage, there is sufficient software available to create applications. The plans to finish the first complete “light-weight” version, using only Java SE and having the full set of characteristics described above, are quite concrete and already tied to specific technical issues logged in the project's issue tracker. An outline for a somewhat heavier version using Java EE components is given on the project description page.

Yet Another Path to Code Reuse

Subtly perhaps, four different approaches to code reuse have been mentioned and illustrated in this article.

First, the development of higher-level languages involved assigning names to commonly used bits of logic in lower level coding and the development of tools to translate (interpreters, compilers, …) back into lower level code. One common implementation of the lower level code was then used by everyone using the same higher level language (compiler …).

Second, object-oriented computing involved re-organizing some of the processes common to applications from the complete application level, down to component level.

Third, a more open mass use of certain web-based technologies led to common application cores and shared extensions. (Proprietary: Lotus Notes → Open: WordPress; and also more on the extreme techie side, consider the history of Sun Java development).

Fourth, highly innovative research identified distributed application logic that could be extracted into generic processing engines.

At least one more exists, which will be the subject of later articles. Learning and adaptive software has already reached well into the stage of commercial use. Developers write code explaining the results they want. The learning system automatically creates the code. There are many circumstances in which the same specification code written by developers can be reused on various platforms (physically different robots for example) in different environments and for different purposes (identifying and moving different objects in new environments for example). Even existing application code can be reused and automatically adapted to differences.

The direction of HLL incorporates all of the above, providing a useful general purpose application platform rather than a specialized application platform (like a CMS, for example). It will be part of the purpose of the HLL Blog to provide further focus on these characteristics of the HLL system.

Given the current state of HLL, there is at least one characteristic that should be emphasized. Application developers focus their coding work only on those components that are unique to their application. There is a great deal of flexibility in what can be done in the application because, simply, there are no restrictions. Java developers, for example, can write any kind of Java components they wish. The HLL engine can use components from any installed HLL system, wherever one exists (component sharing). Components that builders wish to make accessible to HLL are named (as in a higher level language) and made available to HLL systems through their configurations.
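As a rough idea of what “named components, accessible through configuration” can mean in Java, consider the sketch below. It does not show HLL's actual configuration format or API, which are not described here. ComponentRegistry, Greeter, and the name=class configuration line are assumptions made purely for illustration, using standard Java SE properties and reflection.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.concurrent.Callable;

/** Hypothetical registry: maps configured names to component classes, loaded by reflection. */
final class ComponentRegistry {
    private final Properties config = new Properties();

    /** Reads "name=fully.qualified.ClassName" lines from the configuration text. */
    ComponentRegistry(String configText) {
        try {
            config.load(new StringReader(configText));
        } catch (IOException e) {
            throw new IllegalArgumentException("Unreadable configuration", e);
        }
    }

    /** Looks up the class configured under the given name, instantiates it, and runs it. */
    Object run(String name) {
        String className = config.getProperty(name);
        if (className == null) {
            throw new IllegalArgumentException("No component configured as: " + name);
        }
        try {
            Object component = Class.forName(className).getDeclaredConstructor().newInstance();
            return ((Callable<?>) component).call();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot run component: " + name, e);
        }
    }
}

/** Example application component; any Callable with a no-argument constructor would do. */
final class Greeter implements Callable<String> {
    @Override
    public String call() {
        return "Hello from a configured component";
    }
}

final class RegistryDemo {
    public static void main(String[] args) {
        // In practice this text would come from a configuration file, not from code.
        String configText = "greeter=" + Greeter.class.getName();
        ComponentRegistry registry = new ComponentRegistry(configText);
        System.out.println(registry.run("greeter"));
    }
}
```

The calling code refers to the component only by its configured name. Pointing that name at a different class changes the application without recompiling anything that uses it.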

This aspect of HLL is worth emphasizing. The intent is that, especially as an organization builds its library of reusable functionality, application development will be largely a matter of configuration; and that's a very good reason to push reusable code development.

References:
  1. SOFTWARE REUSE ECONOMICS: COST-BENEFIT ANALYSIS ON A LARGE-SCALE ADA PROJECT, Johan Margono and Thomas E. Rhoads, Computer Sciences Corporation, System Sciences Division. (I believe this 1992 article in ACM, which I found freely available on the Internet, is still quite relevant. Margono and Rhoads however, did not say “benefits … far outweigh the costs.” They actually said; “benefits … have so far outweighed the costs. We believe that this will continue to be the case.” Eighteen years later, with a great variety of new advantages mentioned in this current article, it is surely even more true due to long-term technical focus on the issue; and this article recasts the issue in that light. We've come a long way.)
  2. Nwana, H.S. 1996. Software Agents: An Overview. Knowledge Engineering Review, Vol.11, No.3, 205-244, Cambridge University Press
  3. Schermer, B.W., Software agents, surveillance, and the right to privacy: A legislative framework for agent-enabled surveillance. Leiden University Press, 2007, p.140.
  4. Carl Hewitt; Peter Bishop and Richard Steiger (1973). A Universal Modular Actor Formalism for Artificial Intelligence. IJCAI.
  5. Agile software development methods: Review and analysis, Pekka Abrahamsson, Outi Salo, Jussi Ronkainen, and Juhani Warsta, Espoo 2002, VTT Publications 478, 109 p.
