Powerful Editor Abstractions

Not all editing tools are created equal.

Combining several simple ideas into one compound one, and thus all complex ideas are made - John Locke, An Essay Concerning Human Understanding (1690)

While many words have been spent on the Holy Editor Flame Wars, I’ve read very little that attempts to categorize what they actually do. Likewise, everyone pays homage to the concept of using the “best tool for the job”, but few talk about when each tool would be best.

Our editors provide abstractions classified into three main categories:

Semantic Tools
Run-time Inspection Tools
Text-based Tools

Semantic Tools

Five abstractions are the core of Semantic Tools. If we had nothing else but these, we would have quite a bit indeed! These make a huge program much more manageable.

Language Errors - Indicate code that violates the rules of the language
Find All References - See a list of all usages of a field, function, or class
Rename Symbol - Rename the current field, function, or class
Auto-complete - Show a list of possible symbols to complete section, ideally with documentation
Go To Definition - Move editor to the symbol’s defined location

These actions allow us to interact with the Abstract Syntax Tree (AST) of the codebase. Unfortunately, building a correct AST before run-time is not always possible. Building an always-accurate AST is impossible in any language with reflection, or weak types, or dynamic types, or those with syntactic macros.

My first encounter with the AST problem came when a manager asked me to rename all uses of Contact.Id to Contact.ContactId in a big PHP project. I renamed all I could find using sed and grep. I ran the program and encountered hundreds of run-time errors. What did I miss? Someone had stored the string “Id” in the database which was read out and combined with PHP magic to access the Id field on my class! No refactoring tool could have possibly detected that. That sort of reflection makes it impossible for any tool to build a completely correct AST.

Even flagship IDE languages like Java and C# still have a version of this problem. Sharing dlls or jars to another project breaks semantic AST tools. A Rename Symbol will only detect and modify usages in the current open project, not every consumer of the produced binary. Reflection and explicit casting also breaks semantic tools. Depending on your environment and what you do with binaries, this ranges from a minor nuisance to a major inconvenience.

In the last decade, several IDE’s have added plugins that can build an AST from PHP, Python, Ruby, JavaScript, Clojure, etc by using raw text parsing or through a modification to the interpreter. While incomplete by nature, these at least provide some modest functionality as long as the programmer is aware of their limitations.

Run-time Inspection Tools (Debuggers, REPLs)

These tools allow the programmer to execute the code, and inspect what is happening. They are typically integrated with the language itself, either provided directly or through a library.

Stop at Breakpoint - Run code to a point and stop execution there
Inspect Call Stack - See the function calls that led to a breakpoint
Inspect scope - See all variables in scope at a breakpoint
Execute statement at point - Run any arbitrary code in the scope of the current breakpoint
Modify the executing code - Modify variables and functions dynamically in the running code

Run-time inspection tools are excellent for seeing how a program is executing. They allow the programmer direct insight into a running program. There have been many times when I’ve been completed stumped with a program, only to run it with a debugger and suddenly realize what was happening. If you have access to these tools, they are worth learning well.

Debuggers are most useful if you can recreate the scenario you need to inspect. Debuggers will not help for debugging an issue unless you cannot completely recreate the exact scenario you need. For this reason, unit tests combined with a debugger are a powerful combo.

Many REPLs (Read Eval Print Loop) allow you to modify the code live while connected to the running process. Languages like Common Lisp, Smalltalk, and Clojure are famous for allowing the programmer to replace blocks of code on a running server: sometimes even in production! Integrated REPLs are an extremely powerful tool, and powerful tools can be quite dangerous. Knowing how to use a tool well is the first step in knowing how to use it wisely.

Text-Based Tools

A text-based tool operates on raw text only. It can be limited to a single file, or many files. Text tools do not know anything about the AST or structure, they only operate on raw text.

Find/Replace - Search for text and replace it
Regular Expressions - Transform text based on pattern matching rules
Record/Playback Macro - Record and playback a series of operations
Balance parenthesis/brackets/quotes - Transform balanced sections
Linting/Style enforcement - Show places code doesn’t match a preferred style, optionally fix style warnings

Consider the common Semantic transformation Extract Interface. It takes a class and generates an interface next to the class containing all the public functions from the class. The refactoring does not require an AST to work, it can be easily achieved by combining several text commands. See example in my post Vim Refactoring Patterns

Any refactoring can be replaced with a text macro or regular expression. Here is where we see the power of good abstractions. Where an IDE refactoring suite may provide dozens or hundreds of specialized commands that only work in a single language, a few good text abstractions compose endlessly in any text files.

A skilled user of these basic abstractions can solve any text-based refactoring in only a few steps. They can invent new refactorings, solving any text manipulation they need.

Become a master of Regular Expressions or Macros with practice exercises to teach you how to think using these tools, and identify places where they are best used: 10 Minute Vim

Balance Parenthesis

This set of tools is not very well known, despite being very useful in the right context. The best example of it is: ParEdit

The tool by default prevents the programmer from entering unbalanced characters, but then offers a suite of commands for transforming them.

ParEdit is less useful in languages with relatively fewer balanced characters, and most useful in languages where balanced characters contain an entire expression:

Balanced Characters	Languages
Fewer	Python, Ruby, OCaml/F#, Haskell
More	C#, Java, Javascript, C++
Surround Entire Expressions	Clojure, Scheme, Racket

If you regularly use a language in the bottom two rows, you might want to check out if your editor has an implementation of ParEdit!

Conclusion - Have all tools at your disposal

The real power comes when you can combine Semantic, run-time, and text-based tools to solve different problems.

For this reason, if you have the ability, you should learn one from each category of tools available to you.

Semantic Tools
- Intelij IDEA, PyCharm, VS Code, Visual Studio, Eclipse
Run-time Inspection Tools
- Debuggers in Intelij IDEA, PyCharm, VS Code, Visual Studio, Eclipse
- REPL (Python, Ruby, Clojure, Haskell, Javascript, Smalltalk)
Text-based Tools
- Vim / Emacs (even available an IDE plugin!)
- ParEdit
- Grep
- AWK

Mastering each category of tools will go a long way to increasing your ability to write and transform code efficiently.

Semantic Tools

Run-time Inspection Tools (Debuggers, REPLs)

Text-Based Tools

Balance Parenthesis

Conclusion - Have all tools at your disposal

steve shogren

my books:

posts for: