Naming is half the battle

In our quest to come up with great names for our objects, we took a wrong turn. Although we have only imported data from files for the past 18+ months, we decided to pick more general names for the objects that represent any kind of data we might import, regardless of whether it came from a “file” or not. So instead of ImportFile, FileLayout, etc., we have Dataset and DatasetDefinition. Not bad, I guess. On syntax alone you cannot argue with these name choices. However, our model is littered with these general names because we wanted flexible names that could adapt to the future without having to change. (As you might imagine, there are many more discrepancies than the two I am mentioning here.) This general naming is not horrible on its own, but it perpetuates the idea of creating “general” software that is not specific to its core concern: importing data files. Thus, when considering new design, changes, and maintenance, we have to consider some unknown future and this flexibility. With any change we have to think about the 4 or 5 things I listed in the other post that have nothing to do with the core business of importing a data file.

In summary, this simple decision to use more general names created two problems for me in the long run.

  1. Since the users only import files, they only speak of Files and File Layouts. The words Dataset and Dataset Definition are never spoken. Thus one of the beauties of business-object-oriented development is lost: a consistent language shared across the domain and the object model.
  2. By designing with an unknown, or rather unspecific, future in mind, we created dependencies on something that does not exist and perpetuated that idea. When working with the code that defines the file (a dataset definition), we have to try to recall what we meant by Dataset and how our changes might affect a broader, non-existent design.
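
To make the first point concrete, here is a minimal sketch in Python of what the object model might look like if it used the users' own words. Only the names ImportFile and FileLayout come from this post; the fields, the comma-delimited format, and the parse_line method are assumptions made up for illustration, not our actual code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FileLayout:
    """What users call a "File Layout": the ordered columns of a file.

    The fields here are hypothetical, chosen only for illustration.
    """
    name: str
    columns: tuple[str, ...]


@dataclass
class ImportFile:
    """What users call a "File": a path plus the layout it follows."""
    path: str
    layout: FileLayout

    def parse_line(self, line: str) -> dict[str, str]:
        # Assumes a simple comma-delimited line with no quoting.
        values = line.rstrip("\n").split(",")
        if len(values) != len(self.layout.columns):
            raise ValueError(
                f"expected {len(self.layout.columns)} values, got {len(values)}"
            )
        return dict(zip(self.layout.columns, values))


layout = FileLayout(name="customers", columns=("id", "name"))
import_file = ImportFile(path="customers.csv", layout=layout)
row = import_file.parse_line("1,Ada\n")
```

Nothing in this sketch hints at sources other than files, so a reader only has to think about files and file layouts, the same words the users already speak.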

The future may not be completely unknown. It may be true that we will someday import data that does not come from a file. However, the future is not specific. Specifics happen when we have actual use cases, actual test data, and scenarios we can walk through. When the day finally comes that we have to import data from something other than a file, we should analyze the requirements and design the solution that is best for that need. The way we import files now should not corrupt that design, just as our current design should not have been corrupted by this possible future need. Once we think we have a whiteboard design of the new subsystem ready, we can compare the two and see what code they have in common. We should then share that code and collective knowledge, as long as we don’t corrupt either design.

Some of my original ramblings about accidental complexity.