What's behind opcodes?

PHP cannot run by itself. It’s an abstraction meant for developers. Fortunately for us, some specific programs translate the human-readable code into machine-readable code.

This translation is what we call compiling.

The PHP process in short

We can learn from the documentation that there are four main steps: lexing, parsing, compiling, and runtime (execution).

During the lexing part, your code gets scanned and tokenized. During the parsing part, tokens get reassembled to create meaningful expressions. Then, during the compiling, those meaningful expressions become instructions.
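To make the lexing step concrete, here is a minimal sketch that uses PHP's built-in tokenizer function `token_get_all()` on a one-line snippet. The snippet itself and the `$names` variable are only there for illustration.

```php
<?php
// Tokenize a tiny snippet with the built-in tokenizer extension.
$source = '<?php $total = 1 + 2;';

$names = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        // Array tokens are [id, text, line]; token_name() maps the id to a label.
        $names[] = token_name($token[0]);
    } else {
        // Single-character tokens such as "=", "+" and ";" come back as plain strings.
        $names[] = $token;
    }
}

// Prints the token labels in source order, e.g. T_VARIABLE for $total.
echo implode(' ', $names), PHP_EOL;
```

The list it prints is what the parser receives: labels like T_VARIABLE and T_LNUMBER rather than raw characters.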

Those instructions, also called “opcodes” (operation codes), are what the PHP virtual machine ultimately executes.

From source code to machine code and beyond

During the lexing part (the first step), the lexer divides your code into individual tokens. The parser then arranges those tokens into the parse tree, a hierarchical structure that reflects how the expressions nest.

This tree gives the compiler the context it needs to generate instructions that compute values in memory and carry out what we ask for when we use assignments, loops, if statements, and so on.
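The step from tree to instructions can be sketched with a toy compiler. This is not how the Zend Engine does it: the node shape, the `compileNode()` function, and the PUSH/ADD/MUL instruction names are all hypothetical, chosen only to show how a hierarchical tree becomes a flat list of instructions.

```php
<?php
// Toy compiler (hypothetical, not the Zend compiler).
// A node is either ['num', value] or [operator, left, right].
function compileNode(array $node): array
{
    if ($node[0] === 'num') {
        return [['PUSH', $node[1]]];
    }

    $instructionFor = ['+' => 'ADD', '*' => 'MUL'];
    [$operator, $left, $right] = $node;

    // Post-order walk: emit both operands first, then the operation that uses them.
    return array_merge(
        compileNode($left),
        compileNode($right),
        [[$instructionFor[$operator]]]
    );
}

// Hand-written tree for the expression 1 + 2 * 3.
$tree = ['+', ['num', 1], ['*', ['num', 2], ['num', 3]]];

foreach (compileNode($tree) as $instruction) {
    echo implode(' ', $instruction), PHP_EOL;
}
```

Because the multiplication sits deeper in the tree, its instruction is emitted before the addition, so operator precedence is already encoded in the order of the instructions.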

Our PHP code abstracts operations that ultimately exist as circuits in the processor. For example, an if statement becomes a conditional jump instruction: if the input does not match the condition, execution jumps past the body of the if.
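The idea of an if statement as a jump can be illustrated with a toy interpreter. It is a sketch, not the Zend VM: the `runToyProgram()` function and the instruction names are hypothetical, only loosely inspired by real opcode names such as JMPZ ("jump if zero").

```php
<?php
// Toy interpreter (hypothetical): shows an "if" as a conditional jump.
function runToyProgram(array $program, array $vars): string
{
    $output = '';
    $pc = 0;        // program counter: index of the next instruction
    $flag = false;  // result of the last comparison

    while ($pc < count($program)) {
        [$op, $a, $b] = $program[$pc];
        switch ($op) {
            case 'IS_GREATER':
                $flag = $vars[$a] > $b;
                $pc++;
                break;
            case 'JMPZ':
                // Jump to instruction $a when the flag is false, else fall through.
                $pc = $flag ? $pc + 1 : $a;
                break;
            case 'ECHO':
                $output .= $a;
                $pc++;
                break;
            default:
                throw new InvalidArgumentException("Unknown opcode: $op");
        }
    }

    return $output;
}

// Roughly: if ($x > 10) { echo "big "; } echo "done";
$program = [
    ['IS_GREATER', 'x', 10],  // 0: flag = ($x > 10)
    ['JMPZ', 3, null],        // 1: condition false? skip the body
    ['ECHO', 'big ', null],   // 2: body of the if
    ['ECHO', 'done', null],   // 3: first instruction after the if
];

echo runToyProgram($program, ['x' => 42]), PHP_EOL; // big done
echo runToyProgram($program, ['x' => 3]), PHP_EOL;  // done
```

There are no braces left in the instruction list: the block structure of the if has been replaced by an index to jump to.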

An executable program is a series of instructions for the CPU written in binary (0s and 1s): the machine code.

The processor receives those binary bits as electrical impulses. Those on/off variations trigger predefined behaviors, such as moving values in registers.

The PHP engine and the opcache

The PHP engine is also known as the Zend Engine. It’s the PHP virtual machine. It directly executes opcodes (instructions).

In PHP 7, compilation can take longer than in PHP 5, mainly because of an additional intermediary step, the abstract syntax tree (AST). It’s longer but smarter: it generates better instructions (opcodes).

PHP 8 goes further with its JIT (just-in-time) compiler. Even so, generating opcodes takes a significant number of operations, which is why opcache is so valuable.

Without a cache, every time you call your PHP script, the Zend Engine repeats all the operations we just saw. By enabling opcache in the PHP configuration, you can drastically improve overall performance: the compiled opcodes are kept in shared memory, so on later requests the engine skips lexing, parsing, and compiling and goes straight to runtime.
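To check whether opcache is actually active, a small sketch like the following can help. It relies on the real `opcache_get_status()` function; the `describeOpcache()` wrapper and its messages are made up for this example. Note that opcache is controlled by the `opcache.enable` ini directive, and the command line additionally needs `opcache.enable_cli`, so the result may differ between your terminal and your web server.

```php
<?php
// Report whether opcache is active for the current SAPI (CLI, FPM, ...).
function describeOpcache(): string
{
    if (!function_exists('opcache_get_status')) {
        return 'opcache extension is not loaded';
    }

    // false = skip the per-script details; the call returns false when opcache is off.
    $status = @opcache_get_status(false);
    if ($status === false || empty($status['opcache_enabled'])) {
        return 'opcache is loaded but disabled';
    }

    // Statistics may be missing in some setups, so fall back to zero.
    $stats = $status['opcache_statistics'] ?? [];

    return sprintf(
        'opcache is enabled: %d cached scripts, %d hits, %d misses',
        $stats['num_cached_scripts'] ?? 0,
        $stats['hits'] ?? 0,
        $stats['misses'] ?? 0
    );
}

echo describeOpcache(), PHP_EOL;
```

A growing number of hits compared with misses suggests that requests are being served from cached opcodes instead of being recompiled.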

Wrap up

It’s not critical to know all these steps unless you want to understand how things work at a fundamental level.

That is precisely the purpose of a high-level language such as PHP: it abstracts away the low-level details, and the compiler does the rest.

I’m just fascinated by compilers and how they changed everything. Some programmers wrote programs that allow other programmers to make their programs work on every computer without worrying about the extensive range of operating systems and processors.

Anyway, I hope you liked this attempt to make sense of opcodes.