On October 6th, a member of the F# community had had enough. He'd grown tired of spending time working on submissions to UserVoice for F# language enhancements, and after some prodding submitted THE SUGGESTION THAT WOULD LIVE IN INFAMY. It immediately generated some frenzy, and I'd had a bad day at work, so on a whim I decided to finally learn Canopy, a Selenium library for F#, and try my hand at a first pass of a migration of issues from UserVoice to GitHub Issues.
Canopy is really a fantastic library. In essence it's a tightly-crafted DSL on top of the Selenium WebDriver library, which allows for easier automation of web browsers through code. It's most commonly used for UI acceptance tests in my experience, but we're going to pervert its intentions a bit.
But first, some links:
- The old uservoice is here: https://fslang.uservoice.com/forums/245727-f-language
- The new language design repo is here: https://github.com/fsharp/fslang-suggestions
- The repo for the code that ended up generating the new language design repo is here: https://github.com/dsyme/fsharp-lang-suggestions-scripts
And please don't judge the code quality too harshly; Jared and I were kinda blazing through this thing in off hours!
- Discover the list of all issues on UserVoice
- Pull over metadata for each of the issues:
  - Official Responses
- Use that metadata to create a matching issue on GitHub
So to start with, after hashing out some requirements, we landed on the following set of base models:
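As a rough illustration only, here is what a set of models for this kind of migration might look like. Every type and field name below is an assumption made for the sketch, not the models from the original scripts.

```fsharp
open System

// Hypothetical shapes: all names here are illustrative guesses,
// not the definitions used in the migration scripts.
type Comment =
    { Author   : string
      PostedOn : DateTime
      Body     : string }

type Response =
    { RespondedOn : DateTime
      Message     : string }

type Idea =
    { Number      : string
      Title       : string
      Submitter   : string
      SubmittedOn : DateTime
      Text        : string
      Votes       : int
      Status      : string
      Comments    : Comment list
      Response    : Response option }   // None when there was no official response

// A made-up idea, just to show how the pieces nest.
let sampleIdea =
    { Number = "1"
      Title = "Example suggestion"
      Submitter = "someone"
      SubmittedOn = DateTime(2016, 1, 1)
      Text = "Body of the suggestion"
      Votes = 3
      Status = "open"
      Comments = [ { Author = "someone else"; PostedOn = DateTime(2016, 1, 2); Body = "+1" } ]
      Response = None }
```

Modelling the official response as an `option` keeps "no response yet" explicit instead of relying on empty strings.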
With these models in hand, we can now go parse sections of each page out to get each Idea. All of the code directly related to scraping the pages is here. An interesting point is that there's no master list of all issues for a forum; you've got to go scrape through each different category of issues to get the entire list, so that's what we've done in the discoverIdeas function.
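The category-by-category discovery can be sketched without any browser automation. In this minimal version, `linksInCategory` is a hypothetical stand-in for whatever reads the idea links off one category page, and the de-duplication step is an assumption in case a link shows up more than once; the real `discoverIdeas` may well differ.

```fsharp
// Hypothetical sketch: `linksInCategory` stands in for the browser-driven
// code that lists the idea links found on a single category page.
let discoverIdeas (linksInCategory : string -> string list) (categories : string list) : string list =
    categories
    |> List.collect linksInCategory   // gather the links from every category
    |> List.distinct                  // drop repeats, keeping first-seen order

// Made-up data standing in for the site.
let fakeSite =
    Map.ofList
        [ "category-a", [ "/ideas/1"; "/ideas/2" ]
          "category-b", [ "/ideas/2"; "/ideas/3" ] ]

let discovered =
    discoverIdeas (fun category -> Map.find category fakeSite) [ "category-a"; "category-b" ]
```

Passing the page reader in as a function keeps the traversal logic testable without launching a browser.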
Next was the grunt work of cycling through each of the links we just found and parsing them. This is a hairy chunk, but relatively straightforward:
After everything was parsed, we did a quick Json.Net dump of the Idea list into a file so we wouldn't have to scrape that again. It took about 20 minutes to scrape all of UserVoice because Canopy seems to implicitly have a single webdriver context. I thought about parallelizing by looking into passing contexts around to use, but I couldn't find a way to do that and once we had the data everything else fell into place.
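The "scrape once, then reuse the file" step might look roughly like the sketch below. It swaps in `System.Text.Json` for Json.Net purely to stay dependency-free, and `CachedIdea` and `loadOrScrape` are hypothetical names with a deliberately trimmed-down record.

```fsharp
open System.IO
open System.Text.Json

// Simplified, hypothetical record; a real Idea model would carry more fields.
type CachedIdea = { Number : string; Title : string; Votes : int }

/// Read ideas from `path` if the cache file exists;
/// otherwise run `scrape` once and save its result for next time.
let loadOrScrape (path : string) (scrape : unit -> CachedIdea[]) : CachedIdea[] =
    if File.Exists path then
        JsonSerializer.Deserialize<CachedIdea[]>(File.ReadAllText path)
    else
        let ideas = scrape ()
        File.WriteAllText(path, JsonSerializer.Serialize(ideas))
        ideas
```

With something like this in place, a 20 minute scrape only has to succeed once; later runs of the script start from the file.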
Having secured the data, we then needed to make markdown issue templates to render the ideas into a form that would look good on the GitHub Issue. My preferred templating library in .Net is DotLiquid for its simple setup and reliability. If you'd like to see what the template looks like, take a quick detour and check it out here. Simple, no? I did have to do a bit of configuration to get the templating system to recognize F# records, but luckily that work had already been done by (I think) Tomas Petricek and Henrik Feldt. Basically, to use the templates in a nice way we have to register the public members of any type we want to use as a model in the template, and there are some special cases around Seq, List, and Option that have to be handled for DotLiquid to work. The code looks like this:
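The original registration code isn't shown here. As a partial, hedged sketch of the first half of the idea, the snippet below uses F# reflection to collect the public field names of a record, which is the list a templating library would need in order to expose them. The hand-off to DotLiquid itself (for example through its safe-type registration) and the Seq, List, and Option special cases are left out, and `SampleModel` and `recordMemberNames` are hypothetical names.

```fsharp
open Microsoft.FSharp.Reflection

// Hypothetical model used only for this illustration.
type SampleModel = { Title : string; Votes : int; Tags : string list }

/// Names of the fields of an F# record, in declaration order.
/// Returns an empty array for anything that is not a record.
let recordMemberNames (t : System.Type) : string[] =
    if FSharpType.IsRecord t then
        FSharpType.GetRecordFields t |> Array.map (fun field -> field.Name)
    else
        [||]

let sampleMemberNames = recordMemberNames typeof<SampleModel>
```

Collecting the names by reflection avoids keeping a hand-written list of members in sync with each model.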
The code for uploading to GitHub isn't really complicated at all, and lives here. It's mostly a straightforward mapping of Idea -> Issue GitHub Model and then we POST that model up. We do do a few interesting things like assign labels and close the issue automatically if the issue was closed/rejected/etc on UserVoice, as one would expect.
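The mapping step can be sketched as a pure function, leaving the actual POST aside. The types, the status strings, and the rule for which statuses count as closed are all assumptions made for this illustration, not the values used in the real migration.

```fsharp
// Hypothetical, trimmed-down types for the sketch.
type ScrapedIdea = { Title : string; Text : string; Status : string }
type IssueDraft = { IssueTitle : string; Body : string; Labels : string list; Closed : bool }

// Assumed set of UserVoice statuses that should end up closed on GitHub.
let closedStatuses = set [ "declined"; "completed" ]

/// Turn a scraped idea into the data needed to create a GitHub issue:
/// the status becomes a label, and some statuses also close the issue.
let toIssueDraft (idea : ScrapedIdea) : IssueDraft =
    let status = idea.Status.Trim().ToLowerInvariant()
    { IssueTitle = idea.Title
      Body = idea.Text
      Labels = (if status = "" then [] else [ status ])
      Closed = closedStatuses.Contains status }

let declinedDraft =
    toIssueDraft { Title = "Example"; Text = "Rendered markdown body"; Status = "Declined" }
```

Keeping the mapping separate from the upload means it can be checked against sample ideas before anything is sent to GitHub.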
Overall, I'd say that two people spent ~10 hours each working on this, and A LOT of that time was waiting on uploads to break. We spent some time tweaking the issue templates for formatting, but that was a very iterative process. Canopy made it incredibly simple to grab the data we needed to do the migration, DotLiquid made it easy to render nice markdown for the Issues, and OctoKit...OctoKit wasn't the worst.
My overall hope with this post is to show that F# is as great for relatively simple, one-off tasks as it is for more in-depth projects like dependency management, finance, microservices, and any number of other realms (though I do use it there as well). So I encourage everyone who's looking for ways to include it at work to try and tackle your next annoying problem with an fsx script. That's how this project started, until it grew too unwieldy.
Happy Holidays, and Merry F#ing!