Taylor Davidson · Why I started using Github to version control spreadsheets

How I started using new ways (to me) to manage financial models for Foresight.
by Taylor Davidson · 28 Feb 2019

If you’ve ever built a financial model, you’ve likely encountered the joy of digging through emails and lists of files with similar-sounding file names, trying to figure out which file is the most recent version; or integrating comments and edits from multiple people and their specific versions into one “master” model. The longer the file name, the higher the version names, the more one questions the entire process.

I tend to follow semantic versioning for my template financial models, and for years, I’ve used the version numbers in the name of the file. Highest number means latest version, no problem; versioning in the filename made it easy for me to see at a glance which file was the latest, or to communicate to users the degree of changes in the version simply with the version number. It works well at the beginning, but once you get to version 61, with derivations and side branches and commented notes from other users embedded in the file names, it gets extremely messy. [1]

Enter Git, Github, and xltrail

Software developers, of course, have a solution to this, and it’s Git. Git, and other version-control systems, combined with Git services like Github, enable to developers to work on software projects with multiple people and track changes at a very granular level within the files that make up their projects. Spreadsheet developers, however, are generally limited to version control at the project level - take all the changes or none. Git does not work with binary files and only knows if the spreadsheet file has changed, not the lines that did change.

I recently came across xltrail, which offers a solution to bring proper version control to Excel spreadsheets: use Github to store your Excel spreadsheets and authorize xltrail to access your Github to check for new commits, and xltrail will provide you detailed, cell-by-cell comparisons between files of what was edited. I tested it out and quickly learned that the proper way isn’t just to use it the same way you use a folder on your computer, it takes a bit more thinking to do it right.

Felix Zumstein of xltrail recently wrote a post, How To Manage And Release Excel Files On GitHub: A Primer, which helped me understand best practices on how to use Github to store and version-control spreadsheets; getting rid of version control in the filename was step 1, but it was also helpful as a primer on how to use a few Github release options to communicate what was changed and why. And even though we may keep the filename the same inside a Git repo, we can still release versions with semantic version names in the file name if it’s helpful for users. Well worth a read →

Why use version control?

Why use version control? To be clear, I don’t use version control for spreadsheets that are fairly raw, I tend to work on those just inside my project folders, and then once I’m getting close to release I move into version control so I can see the changes more finely. Using Github for private files without xltrail or an engine to pull out the differences (“diffs”) between versions isn’t terribly different than just using Dropbox to store the files, but moving to Git - and services like Github and xltrail - creates more rigour to versioning and editing, better collaboration tools once we do need to share files with people, and - I think - opportunities for broader creation of shared tools and templates. [2]

I’ve also begun testing using Github as a way to distribute my financial model templates and collaborate with spreadsheet developers. I released the Runway and Cash Budget Tool and Cap Table Tool to Github (foresighthq/runway-tool and foresighthq/cap-table-tool, respectively) as an alternate way to provide these tools free to people, but also with the goal to open up to collaboration. Want to improve on the tool? Open an issue, fork the project, and add your own edits and components. Open source spreadsheets, in a way. [3]

My goal with the spreadsheets has always been to make financial modeling easier, better, more reliable, and more accessible to people, and opening up some of the tools to platforms like Github is one way I’m testing to deliver that. Understanding version control, releases, commits, and more are important competencies to be able to leverage the platform, so I’m essentially backing into using version control in order to work towards the bigger goal; but at the same time, it opened me up to a new way to manage spreadsheets, and hopefully reduce the number of v61 best case TD final files littering the hundreds of project folders on my computer. After building financial models for 20 years, why not learn new ways to do things?

  1. And even a bit soul-sucking to know that you’ve been through that many versions. ↩︎

  2. Google Sheets does offer version control, of course, but it’s been hard for me to use it to natively build detailed, complicated models in Sheets, I’m simply 10x faster in Excel. ↩︎

  3. Yes, the templates are not open source at the moment, but they are free for people to use, build on and share, with attribution, for non-commercial use. See Creative Commons license terms → ↩︎