Transforming Hashes: A Refactoring Story
One of the things that made me fall in love with Ruby and Rails is the fact that if you're thinking to yourself "there's got to be an easier way to do this", there probably is. The Ruby community does a great job of fulfilling the lazyness virtue. If something seems hard to do, chances are some pioneering Rubyist has already done it and open sourced it. Between RubyGems, GitHub and StackOverflow, we've got a great toolbox to pull from.
One of the problems I still haven't found a great tool for is transforming complex hashes. I'm not sure if I'm not Googling the right thing, not looking in the right places, or just not thinking outside the box enough but every time I have to transform a hash, it comes out looking kludgy. But still, some code is better than others. Let's take a look at a method I wrote recently.
The Problem
So I've been working on a project that syncs user account data to Google Apps. Google has a gem for working with its APIs. But like so many gems provided by non-Ruby houses, it's a bit awkward to work with. There's plenty more that could be said about that, but for now let's just focus on the format it wants it's hashes in:
{
primaryEmail: 'milton.waddams@initech.com'
password: 'No salt',
name: {
{
givenName: 'Milton',
familyName: 'Waddams'
}
},
organizations: [
{
department: 'Reporting',
title: 'Collator'
}
],
includeInGlobalAddressList: true,
orgUnitPath: '/BasementDwellers'
}
And here's what these values look like in our application:
{
address: 'milton.waddams@initech.com',
password: 'No salt',
first_name: 'Milton',
last_name: 'Waddams',
department: 'Reporting',
title: 'Collator',
privacy: true,
org_unit_path: '/BasementDwellers'
}
If you're lucky, you work in a controlled environment where you don't have to deal with this sort of thing. But if you're like me and constantly interfacing with external APIs, maybe you've had a similar experience.
If it was a simple case of swapping keys, this would be pretty straightforward but the use of nested hashes, arrays and inverted values makes things tricky. I'm sure there are some rockstar programmers out there who can envision clean DRY code in their head from the get-go. But I'm the type that I needs write it, turn my nose up at it, write it again, rinse and repeat until I'm satisfied. Let's take a look at my first stab at this problem.
Round 1: Simple and Naive
def self.to_google_params(params)
google_params = {}
if params.has_key? :address
google_params[:primaryEmail] = params[:address]
end
if params.has_key? :password
google_params[:password] = params[:password]
end
if params.has_key?(:first_name) || params.has_key?(:last_name)
google_params[:name] = {}
if params.has_key? :first_name
google_params[:name][:givenName] = params[:first_name]
end
if params.has_key? :last_name
google_params[:name][:familyName] = params[:last_name]
end
end
if params.has_key?(:department) || params.has_key?(:title)
org = {}
if params.has_key? :department
org[:department] = params[:department]
end
if params.has_key? :title
org[:title] = params[:title]
end
google_params[:organizations] = [org]
end
if params.has_key? :privacy
google_params[:includeInGlobalAddressList] = !params[:privacy]
end
if params.has_key? :org_unit_path
google_params[:orgUnitPath] = params[:org_unit_path]
end
google_params
end
Every bit of code is a trade-off. Here's what I see as the good and bad in this code.
Pros:
- Straightforward - Nobody's going to have a problem figuring out what this code does.
Cons:
- Lengthy - Almost 50 lines is definitely longer than most methods should be.
- Repetitive - This code far from DRY.
- Unwieldly - Making changes to this code, while straightforward, isn't a quick and easy thing.
Round 2: Shorter but Messier
def self.to_google_params(params)
# Our style is exclude = true. Google's is include = true. So we need to invert the value.
params[:privacy] = !params[:privacy] if params.has_key? :privacy
top_mappings = {
address: :primaryEmail,
password: :password,
privacy: :includeInGlobalAddressList,
org_unit_path: :orgUnitPath
}
top_params = params.slice(*top_mappings.keys)
google_params = Hash[top_params.map {|k, v| [top_mappings[k], v] }]
name_mappings = {first_name: :givenName, last_name: :familyName}
name_params = params.slice(*name_mappings.keys)
google_name_params = Hash[name_params.map {|k, v| [name_mappings[k], v] }]
google_params[:name] = google_name_params if google_name_params.any?
org_keys = [:department, :title]
if params.slice(*org_keys)
google_params[:organizations] = [params.slice(*org_keys)]
end
google_params
end
Here I'm starting to feel like I'm heading in the right direction, but I'm still not happy with this code.
Pros:
- Shorter - At almost half the size of our original method, this is getting to be a more reasonable length.
- Semi-Wieldly - It's not too hard to isolate the hashes in here and make changes to them.
- DRYer - There's still some repetition, but it's definitely a bit DRYer than our previous code.
Cons:
- Complex - It takes more than a couple brain cycles to figure out what this code is doing and how.
Round 3
def self.to_google_params(params)
def self.map_it(hash, mappings)
sliced_hash = hash.slice(*mappings.keys)
Hash[sliced_hash.map {|k, v| [mappings[k], v] }]
end
# Our style is exclude = true. Google's is include = true. So we need to invert the value.
params[:privacy] = !params[:privacy] if params.has_key? :privacy
google_params = map_it(params,
'address' => :primaryEmail,
'password' => :password,
'privacy' => :includeInGlobalAddressList,
'org_unit_path' => :orgUnitPath
)
name_params = map_it(params, 'first_name' => :givenName, 'last_name' => :familyName)
google_params[:name] = name_params if name_params.any?
org_params = map_it(params, 'department' => :department, 'title' => :title)
google_params[:organizations] = [org_params] if org_params.any?
google_params
end
And finally we have the final code I actually wound up committing. It's certainly not perfect, but perfect is the enemy of good and I felt that spending any more time on this code would be a waste of my time. Here's where I think we landed.
Pros:
- Short - We've managed to keep the code about as long as the previous iteration.
- Maintainable - I feel like the use of hashes and the method names, makes this code easy to understand and maintain.
- DRY - Again this code being mostly hashes and method calls means that it's pretty DRY.
- Not too complex - The
map_it
method is a bit more obtuse than I'd like, but because it's isolated in it's own method, it doesn't bother me too much.
Cons:
- Code smell - Is having a nested method like this a code smell? I tend to think that it is, but at the same time it still seems like the right thing to do for this code. Really, I suppose the
map_it
method should be an instance method onHash
but I really prefer not to monkey patch if I can avoid it.
Well, there you have it. How do you think I did? Like I said, it's not perfect, but I feel that it's good enough. How would you have done things differently? What did I do wrong? Go ahead. Be brutal. After all, criticism can be better than praise.