In https://discuss.flarum.org/d/26525-rfc-flarum-cli-alpha, we announced the alpha stage of Flarum CLI. We envision this as a CLI tool that will generate boilerplate, update extensions, perform auditing/validation, and all sorts of other cool stuff that makes development easier. So far, we've effectively built out a "proof of concept" for the major categories of functionality:
- boilerplate creation
- code generation
- updating for changes in Flarum
- updating extension infrastructure
(sorry audit, you're just too complicated for now. Someday 😥)
Now that this is working, we need to start expanding functionality, especially when it comes to code generation. We're planning to support generation of everything from JS components to backend models to serializers, extenders, migrations, test cases, and practically everything else you might need in Flarum.
Problem is, this gets complicated fast. Just look at the current code for generating event listeners. It's a confusing, jumbled mess. If we copy-paste this for all the different stuff we want to support generating, it'll be an absolute pain to maintain. And what if we want to support more complex scenarios?
- Adding extenders to extend.php, but not boilerplate stub files
- It'd be cool if, when generating a model, you'd also get the option to generate API controllers for that model. And a serializer. And maybe a validator too? If we just code that up naively, that's going to be a thousand-line-long file. No one will EVER want to touch it.
So in this discussion, I'll give an overview of how the Flarum CLI is currently designed, and how/why we might want to implement code generation.
Current Design
Background
Flarum CLI is the planned successor to the FoF Extension Generator. The old generator was relatively simple: it prompted the user for various config, copied over an extension skeleton from a template, filled in variables from the provided config, and wrote the new files into your filesystem. Hooray, new extension!
CLI Design Overview
The CLI itself is built around the oclif framework. This takes away all the boilerplate work of making the CLI, well, a CLI. Subcommands are nested in folders, and each command corresponds to a file, which exports a class extending the oclif-provided `Command` class. Simple enough.
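To make that concrete, here's a minimal sketch of the command-per-file pattern. The command name and base class here are stand-ins (the real CLI extends oclif's actual `Command`; a local abstract class is used so the shape is visible without the framework):

```typescript
// Stand-in for oclif's `Command` base class, so this sketch runs without
// the framework. In the real CLI, commands extend the class oclif provides.
abstract class Command {
  abstract run(): Promise<void>;
}

// Hypothetical example: a file like src/commands/generate/listener.ts would
// map to a `generate listener` subcommand via oclif's folder-based nesting.
class GenerateListenerCommand extends Command {
  static description = 'Generate an event listener class';

  async run(): Promise<void> {
    // 1. Confirm we're inside a Flarum extension
    // 2. Prompt for config (event class, listener name, ...)
    // 3. Stage file changes in an in-memory filesystem
    // 4. Commit the staged changes to disk
  }
}
```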
Anyways, back to structure. Most commands work through the following process:
- Confirm that we're in a Flarum extension (unless we're generating a new extension), and get the current directory.
- Prompt the user for whatever config data we need
- Create or modify files to accomplish the command's goal using the provided config. This is done in an in-memory filesystem so that if something breaks, we don't create a bunch of unneeded stuff.
- Once everything is done, commit changes to the filesystem. Congrats, you now have a new... something!
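The staged-write flow in steps (3) and (4) could be sketched like this. This is a simplified illustration, not the CLI's actual implementation; the class and method names are invented:

```typescript
// Changes are staged in an in-memory store and only written out in a single
// commit at the end, so a failure partway through leaves the disk untouched.
type Config = Record<string, string>;

class MemFs {
  private staged = new Map<string, string>();

  // Stage a boilerplate file, filling ${var} placeholders from config.
  copyTpl(template: string, dest: string, config: Config): void {
    const rendered = template.replace(
      /\$\{(\w+)\}/g,
      (_match: string, key: string) => config[key] ?? ''
    );
    this.staged.set(dest, rendered);
  }

  read(path: string): string | undefined {
    return this.staged.get(path);
  }

  // Flush everything at once; nothing touches the real filesystem before this.
  commit(write: (path: string, contents: string) => void): void {
    for (const [path, contents] of this.staged) write(path, contents);
  }
}
```

If anything throws before `commit`, no files are created, which is exactly the safety property described above.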
File Creation and the PHP Subsystem
Let's explore step (3) a bit deeper. There are currently 3 general "types" of file changes/modifications we do.
- Read files, look for simple patterns via regex, modify them, and write back to the file. This is what we do for the `update js-imports` command.
- Copy in a boilerplate file, replace variables with values provided via config. This is what we do when we generate migrations, or initialize a new extension.
- Modify an existing file to add/change non-trivial code. This is what we do when automatically adding extenders to `extend.php`.
1 and 2 need no explanation, but 3 sounds pretty complex. Well, luckily for us, there's an awesome library that allows parsing PHP code into an AST, making modifications, and turning it back into PHP, with relatively minimal changes to code style. Yay! Unluckily, it's in PHP itself, so our use of it will also need to be in PHP. But the CLI itself is in JS/TS. We solve this by including a mini PHP package as a subdirectory of the CLI source code, and calling that from JS via the `child_process` node library. Essentially, we call the PHP code from our JS code.
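As a hedged sketch of that bridge (the script path, the "extender spec" shape, and both function names are illustrative, not the CLI's real API):

```typescript
import { execFileSync } from 'child_process';

// Hypothetical shape for the data we hand to the PHP subsystem.
interface ExtenderSpec {
  extender: string; // e.g. 'Flarum\\Extend\\Event'
  method: string;   // e.g. 'listen'
  args: string[];
}

// Serialize the spec into the argv we pass to the PHP script.
function buildPhpArgs(scriptPath: string, spec: ExtenderSpec): string[] {
  return [scriptPath, JSON.stringify(spec)];
}

// Run the PHP script (assumes `php` is on PATH); it would print the
// modified extend.php source to stdout.
function addExtenderViaPhp(spec: ExtenderSpec): string {
  const args = buildPhpArgs('php-subsystem/add-extenders.php', spec);
  return execFileSync('php', args, { encoding: 'utf8' });
}
```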
Improvement Proposal
Currently, the business logic of commands (i.e. the part where files actually get moved around and stuff) is located directly in the commands. We try to DRY some stuff by using a custom subclass for commands, but that's a weak solution. This is great for the alpha version, as we just write the code we need, but as I explained in the beginning, it doesn't scale well. For that reason, I think we need to separate business logic out into a new layer. For now, this layer should focus on modification types (2) and (3).
I think a class with a fluent API would be a good candidate here:
```js
(new GenerationTool(rootDir, currentDir))
  .copyBoilerplateDirectory("extension/*", "$CWD")
  .copyBoilerplateFile("extension/.github/workflows/test.yml", "$EXT/.github/workflows/test.yml")
  .copyStub(Generation.Migration, "$EXT/migrations")
  .copyStub(Generation.Backend.EventListener)
  .addExtender(Extenders.Event.Listen)
  .updateComposerJsonFieldFromInfra('scripts.*')
  .updatePackageJsonFieldFromInfra('scripts.format')
  .promptForMissingConfig(function (missingConfig) {
    return prompts(missingConfig);
  })
  .execute();
```
`$CWD` is the current directory, `$EXT` is the extension root. We should also have a magic `$BEST_DIR`, which would be either the user-provided directory (if explicitly provided) or a "best practices" directory as defined in the schema (see below). For example, if you're generating an event listener and don't say where to put it, we should put it in the `src/Listener` directory. This will make development easier for new devs by eliminating decision making about folder and file structure.
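One plausible internal design for such a fluent API: each chained call just queues a step, and nothing runs until `execute()`. This is illustrative only; the method names and step-queue design are assumptions, strings stand in for the `Generation`/`Extenders` enum identifiers, and `$BEST_DIR` resolution from the schema is elided:

```typescript
class GenerationTool {
  private steps: Array<() => void> = [];
  private log: string[] = [];

  constructor(private rootDir: string, private currentDir: string) {}

  copyStub(stub: string, dir: string = '$BEST_DIR'): this {
    this.steps.push(() => this.log.push(`stub:${stub} -> ${this.resolve(dir)}`));
    return this;
  }

  addExtender(extender: string): this {
    this.steps.push(() => this.log.push(`extender:${extender}`));
    return this;
  }

  // Expand the magic path variables ($BEST_DIR handling elided).
  private resolve(dir: string): string {
    return dir.replace('$CWD', this.currentDir).replace('$EXT', this.rootDir);
  }

  // Run all queued steps; here they just record what they *would* do.
  execute(): string[] {
    for (const step of this.steps) step();
    return this.log;
  }
}
```

Deferring the work to `execute()` also gives a natural place to validate config and prompt for anything missing before a single file is touched.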
`Generation` and `Extenders` could be some form of enums, acting as identifiers for a "Schema" system. This would allow defining "schemas" for extenders and stubs. The purpose would be two-fold:
- The schemas for extenders would be populated with user-provided config, and turned into "Extender Specs", which is how we tell the PHP system what kinds of extenders to add. This way, we could define a single schema for use in potentially many different generation commands, instead of manually assembling the extender spec directly in the command logic ewwwwww....
- We could figure out which user-provided config is needed, what types/validation it should have, and generate a list of missing config.
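To make that second point concrete, here's one possible shape for such a schema and the missing-config computation. Every name here is invented for illustration:

```typescript
// Each extender schema declares the config params it needs, so the tool can
// compute what's still missing and prompt only for that.
interface ParamDef {
  name: string;
  type: 'string' | 'class';
}

interface ExtenderSchema {
  extender: string; // FQCN, e.g. 'Flarum\\Extend\\Event'
  method: string;
  params: ParamDef[];
}

// Hypothetical schema behind something like Extenders.Event.Listen.
const EventListenSchema: ExtenderSchema = {
  extender: 'Flarum\\Extend\\Event',
  method: 'listen',
  params: [
    { name: 'eventClass', type: 'class' },
    { name: 'listenerClass', type: 'class' },
  ],
};

// Compare the schema's required params against user-provided config.
function missingConfig(
  schema: ExtenderSchema,
  provided: Record<string, string>
): ParamDef[] {
  return schema.params.filter((p) => !(p.name in provided));
}
```

The returned list of `ParamDef`s is exactly what `promptForMissingConfig` would hand to the prompting library.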
The part I'm least sure about is providing config values. Should we do it in each step, or all at once at the end? Or both? All at once at the end has the benefit that we can provide some values, and the generation tool will check which ones are missing. Then we can prompt for those. On the other hand, we'd need to namespace them to prevent conflicts. I think the solution here will become apparent as we try out various approaches. This is all private API for internal consumption so BC is not a concern.