One of the tricky things in these situations is that you simultaneously want to change things and want them to stay the same while you change the way they work.
I know I definitely don't want to make any changes to the source files I have (dozens, at least). I want all of them to keep on working just as they have.
But I do know that the whole point of this is to make the configuration more modular, so I will ultimately have to change the configuration files so that they specify the modules in use. Part of me wants to do that up front, but the sensible part of me wants to do it last. So, for now, as I start to pull things out of the "main" code into modules, I am just going to "hardcode" the loading of the modules so that the configuration files do not change. Only at the last minute will I then quickly change all of the configuration files and strip out the hardcoded module definitions.
The ReadConfig class
I may have mentioned before that I write a lot of parsers. And in some sense, this project is a string of parsers connected in one way or another. But I would say that, in fact, it is more just a "text processor" at the moment. There is no abstract grammar involved - I just look at each line of text in context and do a switch based on the first token on the line.
And, tempting though it might be, I'm not going to change that, very simply because in making everything modular, I am making it less susceptible to an overarching model. Such a model would promote consistency (which is always a good thing), but it would also place arbitrary constraints on how the modules work and potentially on literally what they can do; and that is a bad thing. I don't know what every single module might want to do, so I will keep the interface as simple as possible.
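To make the "switch on the first token" idea concrete, here is a minimal sketch of that style of line-by-line processing. It is not the formatter's actual code: the class name LineDispatcher, the "title" keyword and the recorded action strings are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the real formatter.
class LineDispatcher {
    // Records what each line was turned into, so the effect is visible.
    final List<String> actions = new ArrayList<>();

    // Look at one line and switch on its first whitespace-delimited token.
    void line(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty())
            return; // blank lines carry no command

        // Split into the first token and "everything else on the line".
        String[] parts = trimmed.split("\\s+", 2);
        String rest = parts.length > 1 ? parts[1] : "";

        switch (parts[0]) {
        case "title": // assumed keyword, for illustration only
            actions.add("set title to " + rest);
            break;
        case "module":
            actions.add("load module " + rest);
            break;
        default:
            actions.add("unknown command " + parts[0]);
            break;
        }
    }
}
```

There is no grammar here at all: each case decides for itself what the rest of the line means, which is what keeps the interface simple for modules.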
The ReadConfig class itself, though, knows almost nothing about this. It just creates a ConfigParser instance and pumps into it the contents of the Place that it was given.
    public Config read() throws ConfigException {
        ConfigParser parser = new ConfigParser(universe, place.region());
        place.lines(parser);
        try {
            return parser.config();
        } catch (IOException ex) {
            throw new ConfigException("Could not read configuration " + place + ": " + ex.toString());
        } catch (Exception ex) {
            throw WrappedException.wrap(ex);
        }
    }
The ConfigParser class
The ConfigParser as it currently stands has too many responsibilities. Ideally, it should just take the input lines, turn them into a set of actions and inform a listener that something has happened. Because the configuration is hierarchical, it should keep a hierarchy of listeners that reflects the level of nesting. At the various levels, some of the rules remain the same; others change.
In fact, this class currently also has the responsibility for creating the final configuration, as well as constructing all the objects from the parsed configuration. My first task is going to be to break these out. While I'm about this, I'm going to move all of the parsing code into a new package config.reader, and try to leave the actual configuration classes in the config package.
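As a rough sketch of what "a hierarchy of listeners that reflects the level of nesting" could look like, the parser below does nothing except track depth and forward each command to the listener for the current level. Everything in it is an assumption: the names ConfigListener, NestingParser and RecordingListener are invented, and so is the idea that nesting is expressed by leading tabs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical listener: told about each command at its own nesting level.
interface ConfigListener {
    // Handle one command; return the listener for any lines nested beneath it.
    ConfigListener command(String token, String rest);
}

// Hypothetical parser: tracks nesting only, and leaves all meaning to listeners.
class NestingParser {
    private final Deque<ConfigListener> stack = new ArrayDeque<>();
    // Listener returned by the most recent command, used if the next line nests deeper.
    private ConfigListener pending;

    NestingParser(ConfigListener top) {
        stack.push(top);
    }

    void line(String text) {
        if (text.trim().isEmpty())
            return;
        // ASSUMPTION: nesting depth is the number of leading tabs.
        int depth = 0;
        while (text.charAt(depth) == '\t')
            depth++;

        int current = stack.size() - 1;
        if (depth == current + 1 && pending != null) {
            stack.push(pending); // one level deeper: use the nested listener
        } else if (depth > current) {
            throw new IllegalArgumentException("unexpected nesting: " + text.trim());
        } else {
            while (stack.size() - 1 > depth)
                stack.pop(); // back out to the level this line belongs to
        }

        String[] parts = text.trim().split("\\s+", 2);
        pending = stack.peek().command(parts[0], parts.length > 1 ? parts[1] : "");
    }
}

// Example listener: records what it saw, prefixed with the level it lives at.
class RecordingListener implements ConfigListener {
    private final String level;
    private final List<String> log;

    RecordingListener(String level, List<String> log) {
        this.level = level;
        this.log = log;
    }

    @Override
    public ConfigListener command(String token, String rest) {
        log.add(level + ":" + token);
        return new RecordingListener(level + "/" + token, log);
    }
}
```

The point of the split is that the rules at each level live in the listener for that level, so a module could supply its own listener without the parser knowing anything about it.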
A Nasty Surprise
I was in the middle of reorganizing all of the configuration code, and was about halfway through, when suddenly it asked me to authenticate with Blogger. I wasn't expecting that. It turns out that the constructor for the BloggerSink immediately contacts Blogger to find out which posts are live. This seems out of order to me, but, when I thought about where I would expect it to go, I started looking for a phase where back ends are initialized, and realized there is no such place. So I think I will have to consider this when revisiting main() and make sure that for each phase there is a clear "initialize front end processors" and "initialize back end processors".
Interestingly, the drive loader did not seem to load anything from Google Drive during the configuration step. But did it load the index file from disk? Or is that something else again? It seems like there may be more gremlins hiding in this code than I had realized.
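One way to get an explicit "initialize back end processors" phase is to keep constructors free of side effects and move the remote call into a separate method that main() invokes later. The sketch below only illustrates that shape; BackEnd, DeferredSink and Pipeline are hypothetical names, and the "contact" is just a recorded event rather than a real call to Blogger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract for anything that needs a separate start-up step.
interface BackEnd {
    void initialize();
}

// Stand-in for a sink whose constructor does NOT talk to the remote service.
class DeferredSink implements BackEnd {
    private final List<String> events;

    DeferredSink(List<String> events) {
        this.events = events;
        events.add("constructed"); // cheap: just remember the configuration
    }

    @Override
    public void initialize() {
        // In real code this is where authentication and "which posts are live" would go.
        events.add("contacted service");
    }
}

// Hypothetical driver with configuration and initialization as separate phases.
class Pipeline {
    private final List<BackEnd> backEnds = new ArrayList<>();

    // Phase 1: called while the configuration is being read.
    void register(BackEnd backEnd) {
        backEnds.add(backEnd);
    }

    // Phase 2: called explicitly from main() once configuration has finished.
    void initializeBackEnds() {
        for (BackEnd b : backEnds)
            b.initialize();
    }
}
```

With this arrangement, reading the configuration can never trigger an authentication prompt, because nothing reaches the network until initializeBackEnds() is called.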
Losing Control ...
It's usually about this point when a major fork appears in the road and it feels like I am losing control of the changes. This is happening here. The fork appears when you have to choose between nailing down one section of the code, and keeping the code working. There is no "one, true" path. This is the price that must be paid for not doing things right in the first place.
In this case, I have the choice between taking the configuration all the way up front, or building some "scaffolding" to keep all the other cases working while I work through the entirety of one case (say the blogger portion). Given that there are no satisfactory solutions, it won't be surprising to hear that I have never found one. And, this time, as usual, I have decided that I don't like the inefficiency of building scaffolding so that I can keep jumping around in the code; I would rather tackle one part of the code and nail it down properly. On the other hand, for the moment I am doing only the minimum of true "clean up" of the code; I am just moving around what I have to in order to make the configuration more "modular": the rest of the code is still hard-coupled together.
In some ways, this is a problem - for a while at least, I'm not going to be able to test anything. And I'm certainly not going to be able to release anything. And there is a lot of work left undone. On the other hand, this always was a huge task, and the configuration files are the one thing in all the projects that need to change (after all, we are trying to introduce modules into the configuration), so once we are through this phase, everything should settle down a little. So my "justification" is: we can't go all the way to our destination on a nice, smooth road, and this feels like the shortest amount of off-roading, so let's get on with it and back onto a road where all our (regression) tests pass again (i.e. all the existing cases where I have used the formatter).
One of the consequences of this choice is that some of the aspects of the configuration are going to have rough edges for now: for example, in refactoring the way in which we think about filesystems, the "Google Drive loader" is no longer specifically loading from Google Drive: it's loading from an abstract point in the filesystem. That needs to be cleaned up, but it's not urgent. We are also saying "why not allow multiple loaders?" but that is not going into the code just yet. So in these cases, we may have some scaffolding; and in others, the code may just be inconsistent. I'm not even 100% sure that by the end, when I declare "victory", it will all be done. Of course, if this were a professional project, the existence of end-to-end tests covering all possible cases would force me into making sure that was done. But in this case, I am just using my existing projects to drive the work forward, and when they all pass I will be happy.
Likewise, one of the biggest things I need to do is to make the system extensible, and that means pulling all of the module code out of the main body of code. As I'm going, I'm trying to move code into packages called "*.modules.name.*", but what I'm not doing yet is requiring the configuration file to have lines in it that look like:
    module classname
But that is a step I need to go back to as soon as I have finished the first go-around of modifying the configuration processor.
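For what it's worth, handling a "module classname" line need not be complicated. The sketch below uses plain Java reflection to instantiate the named class; the FormatterModule interface, the ExampleModule stand-in and the ModuleLoader class are all assumptions invented here, and the real modules may well need more than a no-argument constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract that every module class would implement.
interface FormatterModule {
    String name();
}

// A stand-in module so the sketch has something to load.
class ExampleModule implements FormatterModule {
    public ExampleModule() {
    }

    @Override
    public String name() {
        return "example";
    }
}

class ModuleLoader {
    final List<FormatterModule> modules = new ArrayList<>();

    // Handle one configuration line; returns true if it was a "module" line.
    boolean line(String text) {
        String[] parts = text.trim().split("\\s+");
        if (parts.length != 2 || !parts[0].equals("module"))
            return false;
        try {
            // Look the class up by name and call its no-argument constructor.
            Class<?> clz = Class.forName(parts[1]);
            Object obj = clz.getDeclaredConstructor().newInstance();
            modules.add((FormatterModule) obj);
            return true;
        } catch (ReflectiveOperationException | ClassCastException ex) {
            throw new IllegalArgumentException("cannot load module " + parts[1], ex);
        }
    }
}
```

Until the configuration files actually contain such lines, the same loader could simply be fed a hardcoded list of them, which keeps the interim "hardcoded" step and the final step on the same code path.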
Then Just Like That ...
And then, suddenly (as always), I make a couple of changes and I'm out the other side. All the configurations load and parse, and when I turn the rest of the formatter back on I'm just a couple of bugfixes away from everything (mainly) working.
The very fact that you are reading this says that I have all of the blogger code working at least to a first pass.
The one exception is that in reorganizing the configuration of the processor files, I explicitly said some variables would be used to configure "modules" - but nowhere did I specify those modules or make the code pass the variables to them. Consequently, those features simply don't work.
I'm not sure that was what I was going to work on next, but it is "divergent" - that is, I would expect it to create more problems than it fixes - so I think I'll tackle that next.