Friday, November 2, 2012

RubyConf Roundup: Day One

I just arrived in Denver and finished up day one of the always-excellent RubyConf.  I met many smart engineers and saw some really informative presentations!  Here are some synopses of the talks I attended.

Yukihiro Matsumoto

I always enjoy Matz's keynote talks.  As the face of the Ruby programming language, Matz has a great "zen" approach to talks, where he speaks generally about programming for fun, personal fulfillment, and making the world a better place.

Matz talked a lot about finding motivation for projects, how motivation is hard to come by, and how it's the single most important factor in doing a good job on something.

Matz asked, "how do you become a language designer?"  Most folks in the audience had not designed or implemented a programming language.  Matz noted that programming in general is very similar to language design:  As a programmer, you design code, APIs, and interfaces.  "Programming is designing DSLs."

Matz noted that the world is full of bad designs.  He showed how he "hacked" shoelaces by super-gluing them so they wouldn't become untied.

Matz's approach to writing software, according to this talk, is to get inspired and "make it happen."

Matz then talked more specifically about the Ruby language and reflected on the scope of the project.  Starting as a pet project in 1993 (almost 20 years ago!),  it became hugely popular and Matz couldn't really process it while it was happening.

Matz noted that Ruby 2.0 will be "faster, more reliable, more tested, and more fun to use."  He gave a quick overview of Ruby 2.0's features but didn't go into detail (see below for that presentation).

Matz ended the talk by telling programmers to "be motivated, get coding, reinvent the wheel, and make the world better."

Nick Muerdter

In this presentation, Nick talked about using evented architecture to implement reverse proxies via (a) rack-reverse-proxy and (b) EventMachine Proxy.

rack-reverse-proxy (RRP) adds much more overhead than EM Proxy for requests that carry a lot of data.  RRP buffers everything in memory, and that is the primary performance bottleneck in the system.

EM proxy deals with data in chunks, and as such, it's much more flexible when dealing with large amounts of data.  Nick noted that EM proxy is more flexible overall.
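
To make the buffering-versus-chunks difference concrete, here is a minimal sketch in plain Ruby.  It is not code from either library and leaves out the event loop entirely; the method names and the chunk size are made up for illustration.  It only shows how much data each style holds in memory at once.

```ruby
require "stringio"

RELAY_CHUNK_SIZE = 16 * 1024 # bytes read per step; an arbitrary illustrative value

# Buffered style: the whole upstream body is held in memory before anything
# is forwarded to the client.
def relay_buffered(upstream, client)
  body = upstream.read
  client.write(body)
  body.bytesize # peak number of bytes held at once
end

# Chunked style: forward each piece as it arrives, so at most one chunk is
# held at a time.
def relay_chunked(upstream, client, chunk_size = RELAY_CHUNK_SIZE)
  peak = 0
  while (chunk = upstream.read(chunk_size))
    peak = [peak, chunk.bytesize].max
    client.write(chunk)
  end
  peak # peak number of bytes held at once
end
```

Both methods deliver the same bytes; only the peak memory differs, which is why the chunked style copes better with large responses.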

Some use cases for reverse proxies like these:

  • error handling
  • web page manipulation
  • inserting a standard JS analytics statement in all responses
  • implementing email servers
Content-length: You need to deal with Content-Length if you're modifying the response for the client, and you must keep it in sync with the actual content.
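
As a rough illustration of the analytics use case and the Content-Length point together, here is a sketch of a Rack-style middleware written without any gems.  The class name, the snippet, and the capitalized header keys are assumptions for the example, not something shown in the talk.

```ruby
# A Rack-style middleware: any object that responds to call(env) and returns
# [status, headers, body].  No gems are required for this sketch.
class AnalyticsInjector
  SNIPPET = '<script src="/analytics.js"></script>' # hypothetical snippet

  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Leave non-HTML responses untouched.
    return [status, headers, body] unless headers["Content-Type"].to_s.include?("text/html")

    html = +""
    body.each { |part| html << part } # buffers the whole body in memory
    body.close if body.respond_to?(:close)
    html = html.sub("</body>", "#{SNIPPET}</body>")

    # The body grew, so the old Content-Length is now wrong.  Recompute it in
    # bytes (not characters) from the modified body.
    headers = headers.merge("Content-Length" => html.bytesize.to_s)
    [status, headers, [html]]
  end
end
```

If the header were left alone, the client would stop reading at the old length and cut off the end of the page.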

Gzip responses can also be problematic for reverse proxies.  You need to buffer the entire response to serve the gzipped content, since you can't break it up into chunks.
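
Here is a minimal sketch of that buffering approach using Ruby's standard zlib library.  The method name is hypothetical, and the caller supplies the modification as a block.

```ruby
require "zlib"

# Rewrites a gzip-compressed body.  The compressed bytes are not edited in
# place: every chunk is collected, the body is inflated, changed, and then
# compressed again.
def rewrite_gzipped(compressed_chunks)
  compressed = compressed_chunks.join # buffer every chunk first
  html = Zlib.gunzip(compressed)
  html = yield(html)                  # caller-supplied modification
  Zlib.gzip(html)
end
```

A proxy doing this also has to recompute Content-Length afterwards, since the compressed size will change.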

Nick noted that there's currently a "web services bonanza" in the federal government.  There's a big push toward interoperability, and many organizations are publishing new web services.

Nick's central theme: Look at different ways to architect your software.  Now that you can implement reverse proxies in Ruby, it's easier for them to perform advanced functionality like connecting to the database, communicating with APIs, etc.

Chris Hunt

Chris works at Square, which lets anyone process payments easily.  They've made apps for Android and iOS, and have a central Rails API.  Creating a Service Oriented Architecture (SOA) was something their team had to do and had to become better at.

The Square team kept adding features to their product over time.  Their central Rails app now has hundreds of model/controller files and hundreds of thousands of lines of code.  They needed SOA as a better way to scale their growing architecture.

The SOA Creed, like that of UNIX, is to do one thing and do it well.  Usually, the idea is to make a new service for a group of new, related features, unless they relate strongly to existing features.

All of Square's SOA components run on JVM platforms (Java, JRuby, Jython, Clojure, etc.).  They can deploy everything via Java, and this normalization is a big win for them.  New components simply require the JVM, and they can run on any server in any data center.

Chris walked us through a "boilerplate" service, which set up documentation, testing, code quality metrics and more for new services.

Chris showed how his team uses FDoc to enforce documentation across services.

A big inspiration for Chris was the book Growing Object-Oriented Software, Guided by Tests.

To test, Chris' team makes fake versions of the other services, usually in Sinatra, so he only needs to describe the service endpoint.  He uses the tool foreman to boot up fake services and use them for development and test environments.
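
To give a feel for how small a fake service can be, here is a dependency-free sketch.  Chris' team uses Sinatra; this example swaps in a bare Rack-style callable so it runs without gems, and the "payments" endpoint and its JSON fields are invented for illustration.

```ruby
require "json"

# Stand-in for a hypothetical "payments" service.  It answers one endpoint
# with canned JSON and returns 404 for everything else.
FakePayments = lambda do |env|
  json = { "Content-Type" => "application/json" }
  if env["REQUEST_METHOD"] == "GET" && env["PATH_INFO"] == "/payments/123"
    [200, json, [JSON.generate("id" => 123, "status" => "captured")]]
  else
    [404, json, [JSON.generate("error" => "not found")]]
  end
end
```

The code under test only needs the endpoint to exist and return plausible data, so the fake describes the contract and nothing more.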

Square uses cane to enforce code quality standards across different teams and services.  If code quality or style isn't up to snuff, it actually breaks the build (awesome!).

Chris uses cubism.js to visualize time series data from errors and other activity.  He uses the enterprise tool Splunk to manage log files and track, say, a payment id as it goes through different services.

Security:  Square uses SSL certificates for mutual authentication between services, as opposed to the one-way SSL that most of us are familiar with from websites.  Every service has an SSL certificate, and each one has an organizational unit (service name).  This lets their team ensure that a service is authorized for a given action, implemented as a before_filter in Rails controllers.
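
The check itself might look something like the sketch below, which uses Ruby's standard openssl library.  The policy table, service names, and method names are all hypothetical; Square's actual implementation was not shown in detail, and how the verified client certificate reaches the application is left out.

```ruby
require "openssl"

# Hypothetical policy: which calling services (identified by the OU in their
# certificate) may perform which actions.
ALLOWED_CALLERS = {
  "refund"  => ["payments", "support-tools"],
  "capture" => ["payments"]
}.freeze

# Pulls the organizational unit (OU) out of a certificate subject
# (an OpenSSL::X509::Name).
def organizational_unit(subject)
  entry = subject.to_a.find { |field, _value, _type| field == "OU" }
  entry && entry[1]
end

# In a Rails app, a check like this could run in a controller filter.
def authorized?(subject, action)
  ALLOWED_CALLERS.fetch(action, []).include?(organizational_unit(subject))
end
```

In real use the subject would come from a client certificate that has already been verified during the SSL handshake; the OU is only trustworthy after that verification.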

Akira Matsuda

Ruby 2.0 is scheduled for release in February 2013.  Akira, on the Ruby core team, walked through the new features.  Lots of this information is available elsewhere, so I'm going to punt on some of this stuff.

Ruby 2.0 is 100% compatible with the current stable version of Ruby (o rly?).  The feature freeze was one week ago.

  1. Module#refine and Kernel#using let programmers "monkey patch" code while keeping the scope constrained to the redefining module.  This makes monkey patches much safer and doesn't pollute the global scope. Kernel#using lets you apply that refinement to a specific scope.  Akira went through some examples by refactoring ActiveRecord and ActiveSupport to use refinements.  Read more here.
  2. Module#prepend lets programmers include functionality from modules into a class, similar to include, but now methods from the module take precedence over those in the class.  This is a much safer alternative to alias_method_chain, found in Rails, and other monkey patching techniques.  Read more here.
  3. Enumerable#lazy lets you perform actions on enumerable objects without evaluating the entire collection.  Akira showed how we could do some processing/evaluation on a range from 1 to infinity, or on a list of all dates, while completing in a reasonable amount of time. Read more here.  
  4. Keyword args.  In Ruby 2.0, you can easily accept hash keyword arguments with default values.  This lets you specify and validate hash keyword arguments and provides a cleaner way to deal with multiple hash parameters (like options and html_options in Rails' button_to, for example).  Read more here.
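
Here is a compact sketch of all four features.  These are my own toy examples, not Akira's, and every module, class, and method name in them is made up.  One caveat: calling `using` inside a module body, as done here, needs Ruby 2.1 or later.

```ruby
# 1. Refinements: a monkey patch that is only visible where it is activated.
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Greeter
  using Shout # active only for the rest of this module body

  def self.greet(name)
    "hello #{name}".shout
  end
end

# 2. Module#prepend: the module's method runs first and reaches the class's
# own method through super.
module Loud
  def speak
    super.upcase
  end
end

class Dog
  prepend Loud

  def speak
    "woof"
  end
end

# 3. Enumerable#lazy: work on an infinite range, evaluating only what is needed.
FIRST_EVEN_SQUARES = (1..Float::INFINITY).lazy.map { |n| n * n }.select(&:even?).first(3)

# 4. Keyword arguments with defaults; unknown keywords raise ArgumentError.
def button_tag(label, http_method: :post, css_class: "button")
  "#{label} (#{http_method}, #{css_class})"
end
```

Outside of `Greeter`, strings have no `shout` method at all, which is the whole point of refinements.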

Matt Duncan
[no slides yet]

Matt Duncan went over some good techniques for reducing downtime during web application deployment.

Matt suggested developers look at a traffic graph and decide what trade-offs need to be made for reducing deployment downtime.  Can you run a migration when traffic is low, if it won't affect many users?  Can you throw money at a faster database to mitigate the situation?

Matt recommended you separate code update deployments from database migrations, the latter of which he calls a "database deploy."  He suggested folks try:

  1. Having an initial migration, just to add new fields, as that introduces little to no downtime
  2. Having a separate rake task to update records as needed, outside of the migration-induced database lock

He suggested developers use "CREATE INDEX CONCURRENTLY," a Postgres feature, to create DB indexes without causing downtime.  (Not using Postgres?  You're out of luck!)

When updating job queue code, he warned developers that "queues are almost never empty," and to take that into account during deployments.

Matt recommended that developers set version numbers on all external services and make it easy to "roll back" external services in case of failure or errors during deployment.

Matt noted that in many cases, developers can disable a feature while running a complex/expensive deployment to update it.  Say you need to deploy an update to a search backend like Solr, for example.  Perhaps you can (a) deploy a change to hide the search bar in your application, (b) update the search service nicely, and (c) add the search bar back.  This seemed a bit crazy to me and would require lots of extra effort.

Matt recommended that folks check out rollout, a library to gradually roll out services to a certain percentage of users at a time.
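
The idea behind a percentage rollout can be sketched in a few lines of plain Ruby.  To be clear, this is not the rollout gem's API; the method name and the hashing scheme are my own assumptions.

```ruby
require "zlib"

# Returns true when the user falls inside the rollout percentage.  Hashing
# the id (rather than picking randomly) keeps each user's answer stable
# between requests.
def feature_enabled_for?(feature, user_id, percentage)
  bucket = Zlib.crc32("#{feature}:#{user_id}") % 100 # a number from 0 to 99
  bucket < percentage
end
```

Because a user's bucket never changes, raising the percentage only adds users; nobody who already has the feature loses it.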

Ben Orenstein
[no slides]

Ben walked us through several examples of refactoring Ruby code.  Ben showed some tell-tale "code smells" and applied advanced object-oriented techniques to refactor the code.  IMO, talks like this really show the maturation of the Ruby ecosystem over the past few years.

Ben demonstrated:

  1. factoring out functionality into private methods so developers didn't have to read it right away
  2. recognizing when classes have "feature envy," or lots of functionality related to a different class
  3. using a single concept (like a range) to represent multiple entities (like a start and end date)
  4. reducing coupling between systems
  5. creating a situation-specific null object so his code could "tell, not ask"
  6. depending on abstractions for related functionality, making it easier to mock and test
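
Two of these techniques are easy to show in miniature: the null object (item 5) and the single-concept range (item 3).  The classes below are my own toy examples, not the code Ben refactored on stage.

```ruby
require "date"

# Null object: stands in for "no user" so callers can just send messages
# ("tell") instead of checking for nil first ("ask").
class GuestUser
  def name
    "Guest"
  end

  def notify(_message)
    # Intentionally does nothing: there is nobody to notify.
  end
end

class User
  attr_reader :name, :inbox

  def initialize(name)
    @name = name
    @inbox = []
  end

  def notify(message)
    @inbox << message
  end
end

# Works the same for real users and guests, with no nil checks.
def greet(user)
  user.notify("Welcome back")
  "Hello, #{user.name}"
end

# One concept instead of two: a Range replaces separate start/end dates.
class Report
  def initialize(date_range)
    @date_range = date_range
  end

  def covers?(date)
    @date_range.cover?(date)
  end
end
```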

Ben also showed us examples of good times to refactor:

  1. All the time :)
  2. When you have "god objects," or objects that try to do everything.  In Rails apps, this is usually (a) the user model and (b) the central object related to your specific app (like "Order" for e-commerce apps).
  3. When you see high-churn files, or files that change often.
  4. When you have lots of bugs surrounding a specific feature. 

Michael Bleigh

Michael talked about CORS, or Cross-origin resource sharing.  Background information available here.

CORS:  As long as other domains implement the protocol, pages served from one domain can request resources from those other domains.  Often, this means we can make an ajax call to a different server.

"Like most good things," this doesn't work in IE < 10.

With CORS, your site has access control headers that show whether a given domain is allowed to make ajax requests.  You can specify acceptable domains, or use a wildcard to allow all domains.

There's an additional "OPTIONS preflight request," triggered when the request is not a simple one (for example, it uses a method such as PUT or DELETE) or has custom headers.  This helps protect servers from cross-origin requests they aren't expecting.  The preflight response includes Access-Control-Allow-Origin and Access-Control-Allow-Methods headers, telling the client whether it's allowed to make the request.
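
Here is a rough sketch of the server side as a Rack-style middleware, written without gems.  The class name and the list of methods are assumptions for the example, and a production setup would handle more cases (such as Access-Control-Allow-Headers for custom headers).

```ruby
class SimpleCors
  ALLOWED_METHODS = "GET, POST, PUT, DELETE, OPTIONS"

  # allowed_origins is a list such as ["https://app.example.com"], or ["*"]
  # to allow every domain.
  def initialize(app, allowed_origins)
    @app = app
    @allowed_origins = allowed_origins
  end

  def call(env)
    cors = cors_headers(env["HTTP_ORIGIN"])

    # Preflight: the browser asks permission before sending the real request.
    # Without the CORS headers in the reply, the browser will not proceed.
    if env["REQUEST_METHOD"] == "OPTIONS" && env["HTTP_ACCESS_CONTROL_REQUEST_METHOD"]
      return [204, cors, []]
    end

    status, headers, body = @app.call(env)
    [status, headers.merge(cors), body]
  end

  private

  def cors_headers(origin)
    return {} if origin.nil?

    if @allowed_origins.include?("*")
      allowed = "*"
    elsif @allowed_origins.include?(origin)
      allowed = origin
    else
      return {} # unknown origin: send no CORS headers at all
    end

    { "Access-Control-Allow-Origin" => allowed,
      "Access-Control-Allow-Methods" => ALLOWED_METHODS }
  end
end
```

Note that the server still answers requests from unlisted origins; it is the browser that refuses to hand the response to the page when the headers are missing.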

CORS disadvantages:

  1. Initial complexity
  2. Cross-service communication
  3. Lack of framework

Michael recommended that Rubyists check out the rack-cors gem.

Michael argues that using CORS for public APIs is a HUGE win, because it's a lower barrier to entry for external developers.  You don't need a server to make requests to your API; it can all happen in the browser.  A developer could implement a locally-hosted HTML file that makes requests to your API.  In other words, it's roughly the same burden on you, but much less on third-party developers.

Michael also walked through some security concerns in terms of handling authentication for public CORS-based APIs.  For example, you can't count on secrets since everything is exposed on the front end, and you still want to identify the requesting application.  GitHub's solution is to only allow CORS requests from domains of registered applications.

Questions?  Comments?  Let me know!  Read more about all the talks here.
