Netflix Android Bootcamp powered by CodePath

We’re excited to share that Netflix is sponsoring CodePath’s next Android class for professional engineers starting March 6th.

Netflix joins Facebook, Airbnb, and Uber as the latest technology company to partner with CodePath in offering the professional developer community a path toward mastery of high-demand technologies via our intensive, free evening bootcamps.

At CodePath, we believe that effective technical education must be directly powered by industry leading technology companies. We’ve seen first-hand that there is no better way to strengthen your relationship with the developer community than by offering unique pathways for engineers to master new and exciting platforms.

CodePath has previously taught thousands of professional engineers at over 800 technology companies in Silicon Valley and has run Android onboarding for Facebook and Airbnb. Course participants also benefit from seeing how they rank against thousands of other top engineers worldwide.

Read more about the announcement on the Netflix blog.

If you are a practicing engineer with at least 4 years of professional experience, you can apply to the Netflix course here.

Take a free iOS/Android 8-week class at Airbnb with CodePath

Airbnb and CodePath are partnering to offer an all-expenses-paid, 8-week Android and iOS mobile development course to 50 external engineers and Airbnb Connect participants.

Over the past few years, CodePath has trained 2000+ engineers at 100+ companies in Silicon Valley. We’re proud of the fact that no individual has ever paid for a CodePath course.

We believe the best education in the world should be freely accessible and we’ve been fortunate to work with great companies like Facebook, Uber, Airbnb, Box, and others to help make this vision a reality.

Alumni of CodePath classes include one of the co-founders of Reddit, a former tech lead for Apple Watch, the head of Android at Coursera, and many now-prominent iOS and Android developers across Silicon Valley.

Our curriculum has been battle tested at many of the top companies in Silicon Valley and we’re excited to partner with Airbnb to bring this curriculum to engineers in San Francisco and all over the United States remotely.

Mariya Presenting

The class starts on October 11 and takes place Tuesdays and Thursdays from 7–9 PM. You can apply for either Android or iOS. We also offer a remote Android and iOS option for all US-based applicants and highly encourage you to apply.

The deadline is September 26th. Apply now! It will take you time to finish your pre-work, so start as soon as you can.

The class is based on interactive lectures and labs. You are actually coding in class, not copy-pasting from a book as you may have done at corporate training seminars. You will also complete app assignments and group projects that will enable you to build fully functional apps for either the Android or iOS platform.

At the end of the program, your group will compete for prizes at Demo Day and you will join our CodePath alumni network which provides access to curated events, opportunities to network with other alums, and priority selection for future classes.

Students will get to interact with Airbnb engineers throughout the course and Airbnb will offer top performers the opportunity to interview for a full-time role within Airbnb.

If you’re a software engineer with 1+ years of work experience who wants to learn mobile, apply now: http://www.codepath.com/courses/airbnb

1+ years of work experience is required. If you’re a college student, apply for CodePath University instead.

Uber and CodePath partner to launch free 8-week Android class

We’re excited to announce that CodePath is partnering with Uber to provide an 8-week evening Android course for experienced engineers this summer.

The purpose of the course is to expand mobile engineering opportunities in the developer community and provide engineers from non-traditional backgrounds an opportunity to take a free, accelerated mobile engineering course.

Twenty-five students and five Uber engineers will be selected to attend in person at Uber’s San Francisco headquarters (some engineers will be approved to attend remotely).

Read more about the announcement on the Uber Blog or apply to the course here.

Teaching Coding Skills in Developing Countries


This spring I enrolled in a CodePath mobile development course. The experience was remarkably positive and beyond what I had expected; so much so that I was inspired to share their curriculum with an unlikely audience: a group of 20 individuals in Haiti with little-to-no prior programming experience. I was born in Haiti and lived there until I left for college, and now I live and work in San Francisco. Having those two vastly different perspectives and experiences has made clear to me how underserved communities are not on equal footing when it comes to participating in opportunities created by the internet, even when they are able to passively access and engage with it.

CodePath provided access to their Android curriculum to launch the program this past summer. I chose the students from more than 150 applicants — most of whom were recruited through a Facebook ad — and I included a mix of high school and university students, as well as a handful of professionals. Carly Baja, a Haitian developer whom I coached on the CodePath Android material, taught the two-month class starting on July 4th. He spent 20 hours in class with the students every week to discuss the video lectures and offer guidance on the assignments. Saint-Louis de Gonzague, a high school in Port-au-Prince, was kind enough to allow us access to their computer room.

To ensure a reasonable course completion rate, we emphasized two key components: commitment and collaboration. One of the more challenging aspects of learning to code is the willingness to put in the hours it takes to attain proficiency. When first learning to code, it’s easy to feel discouraged and abandon a potential career in programming when struggling to understand key concepts or trying and failing to fix a bug for hours, especially if one lacks the support system to persevere. Within the first two weeks of our course, nine students dropped out or were asked to leave the program because they did not demonstrate the level of commitment required for the course. Paring down the class to only those who were willing to put in the time and effort allowed the remaining students to participate in an environment that encouraged favorable outcomes, and allowed us to make the most of our limited resources. In addition, we encouraged all students to collaborate by helping one another on projects and by creating an online discussion forum to share information and collaborate. Many of the students were unaccustomed to working in groups, but quickly came to understand the value of working collaboratively as it led to them progressing more quickly through the material.

We worked hard to understand and adapt to each student’s specific needs. Through student interviews prior to the class, we realized that many students lacked the infrastructure at home to complete homework on their own, so we provided ample time during the session for them to work on their assignments. Given the slow internet, which worked only intermittently, we downloaded all of the lecture videos and setup files ahead of time, and translated common terms for the students who were not native English speakers. The biggest challenge, however, was something far less obvious: most students lacked the inspiration and confidence to even believe they could become successful developers.

Most of us don’t realize that for many people, minorities in particular, the biggest barrier to success is the lack of conviction that they, too, can succeed in certain professional fields.

Over the years, I have observed that many minorities, including the people in this bootcamp, have never met nor even heard of a successful software engineer from a background similar to their own. Therefore, they seldom pursue these kinds of opportunities even when they are seen as desirable professions. In Haiti, computer science is not yet a well-known career path and there are not many visible role models in the industry. As a result, I spent a lot of time one-on-one with the applicants to share my personal story as well as the stories of those around me to inspire them to embrace a field that feels so very foreign and unattainable to them. Knowing others like them who have gone through a similar process helped inspire the confidence that they too could persevere through the most challenging parts of the course and find success. Unfortunately, I feel that coding bootcamps often fail to address these concerns for minorities and, as a result, both enrollment and completion rates are not where they ought to be for those underrepresented groups.

At the beginning I felt like an outsider because I was the youngest student and one of only 3 women in the class. My parents also told me I should do something more “attainable” instead. But I didn’t want to give up and after a while I fell in love with the subject!

— Sarah, high school student in the course


On the last day of the program, we organized a demo day and invited the students’ family and friends as well as key members of the community, including potential employers. The level of excitement we saw was unprecedented. A number of local companies, in dire need of developers, approached us to discuss internship opportunities.

Though the students have developed a strong foundation in Android, as demonstrated by their projects, this experience is only the first step in a long journey ahead of them. We need to help them further their coding skills by giving them continued access to infrastructure (laptops, Android devices, internet), mentorship, and viable opportunities for work in the field. We also envision scaling up our program in order to reach more students in Haiti, and one day we hope to launch in other underserved countries as well.

The experience in Haiti further concretized the importance of leveraging the internet to empower all people, not just a small percentage of those for whom resources are never lacking.

It’s one thing to ‘connect’ the world with the internet, but it is a completely different and far more meaningful goal to provide everyone with the tools they need to participate in the global internet ecosystem.

A number of tech companies have invested resources in expanding their reach but have yet to give people around the world the tools necessary to solve their own problems and to become equal participants in an ever-changing technological world.


It’s Time to Use Interface Builder

When I first learned Cocoa, I didn’t get Interface Builder. It seemed like a beginner’s tool that gets in the way of anything more complicated than a tip-calculator. For years I worked on big teams that decided IB was off the table.

Last year I reviewed CodePath’s curriculum, which starts students off with Interface Builder. Since the classes target senior engineers who want real-world experience, I considered killing IB and forcing everyone to go programmatic. But I knew I first had to give it an honest try.

Well, now I’m an Interface Builder convert. It’s a great fit for most projects, even “at scale.” Not only is it a timesaver, but it leads to fewer bugs than going purely programmatic. The advantages should outweigh most anxieties. While there are situations where it won’t fit, it should be the first tool an iOS developer reaches for, and it’s absolutely critical to master.

Auto Layout

Auto Layout is a powerful tool for dealing with device fragmentation, localization, and really anything with dynamic layout. However, for complex layouts, it’s hard to juggle all of a screen’s constraints in your head. With size classes in iOS 8, it’s only going to get worse.

If there’s one thing that’s tipped the scales toward Interface Builder, it’s real-time feedback for Auto Layout. When you make mistakes programmatically, runtime errors are cryptic, and it’s frustrating to constantly re-run code in the simulator to test fixes. In IB, mistakes are highlighted as you work. It’s a debugger for views.

Static Table Views

Every app has a settings screen. They all have some labels and text fields and switches, and they’re all powered by the same boilerplate code. You could waste an afternoon writing that code yourself, slipping in an off-by-one error or two. Or you could offload it to a library you wrote yourself, which you now have to maintain. Maybe there’s a third-party library, but you’re back in the same position of offloading code to someone else. Why not give it to Apple, where it’s more battle-tested?

WatchKit

While Apple has always offered Interface Builder, it has historically provided programmatic alternatives on equal footing. Things started to change in Xcode 6, when all new project templates used Storyboards.

Now, WatchKit apps require Storyboards. You cannot configure your UI programmatically.

The Trade-Offs

It’s valid to have anxiety over Interface Builder, but I think it’s overblown.

My favorite part of teaching senior engineers from other platforms is getting a fresh perspective on problems. I asked them to play devil’s advocate after trying IB for a few weeks.

“It Feels Magical.”

If you’ve used a “magical” framework on a successful project, you’ve been bitten. Maybe the project changes direction, or you’ve been successful enough that scaling becomes an issue. In reality, you didn’t save time. You just delayed solving the problem.

While Interface Builder looks magical, it’s pretty simple. Anything you can do in IB, you can map 1:1 to code, and it’s straightforward to follow what IB files do. The Cappuccino team actually built a tool that converts XIBs for use with their web framework.

“What about performance?”

Performance can mean a few things. Memory?

The memory usage between methods should be nearly identical. There is a temporary memory usage increase for using nibs or storyboards, but it is likely very minor compared to the views themselves. Similarly while loading and parsing a file is not free, in many cases the cost of doing so is minor, and has absolutely no effect on the performance of the final generated UI.
– RinceWind, Apple Developer, on the Developer Forums

Performance could also mean CPU time, and Matt Gallagher benchmarked it. Out of the box, IB was up to 17% faster than programmatic layout. After some tuning, the programmatic code came out 5-10% faster. In his conclusions, he said:

Don’t assume that NIB files are always slower than generating views in code — it is not always true. While in general, generating user interface views in code appears to be 5-10% faster than loading from a NIB, the reality is that this difference is small enough that it doesn’t matter and there are certainly some views that load faster from a NIB than from code.

His deliberately contrived setup had a 1.8 second load time, and the difference was only about 100 milliseconds. It was also written in 2010. In real-world apps on modern hardware, the difference will be negligible.

On top of this, consider that table view cells are recycled. So even if loading a nib costs a few milliseconds, that cost is paid once, when the screen loads. It should not impact scrolling performance.

Given the low risk, and how little it costs to implement, don’t worry about performance until you’ve benchmarked it in your own app.

“How will it scale?”

If you cram everything into Main.storyboard, you’re doing things wrong. It’s no different than if you try to fit all of your app inside one class in one file. Just like code, you refactor IB files.

As your app grows, split out flows into their own .storyboard files. Then load them programmatically.

let nav = UIStoryboard(name: "SignupFlow", bundle: nil).instantiateInitialViewController() as! UINavigationController

When a single flow gets too complex, you can break things down until a storyboard contains a single screen. When that gets too big, generate the controller programmatically and load individual views via XIBs.

“What about merge conflicts?”

As we refactor our Storyboards into smaller IB files, we reduce the risk of conflicts. That said, conflicts can still happen, and yes, hand-merging these formats is painful.

As iOS developers, we have to contend with a much more dangerous file: Xcode projects. Most teams just deal with it. Big companies like Google worry about scaling it, so they built GYP (Generate Your Projects), which generates IDE project files as part of a larger build process, taking their management out of the hands of developers. It’s great for Google, but overkill for 99% of apps.

Beyond all that, we already deal with large opaque binaries in our apps: image assets. We don’t draw all our assets programmatically in the name of purity.

You may cringe at that comparison. “Image assets are a designer’s problem!” I wonder if layout should be, too.

Early video games were developed entirely through code. Today, game developers invest in level editors so artists can focus on the work. Meanwhile, many iOS teams expect designers to hand engineers an annotated PSD, and engineers to transcribe it into code. So long as you clearly define boundaries, it seems better to let the artist build out the art and let the engineer focus on integration.

Conclusions

Obviously there are apps where IB is the wrong tool for the job. I just think they’re in the minority.

Before you judge IB, ship a real project with it. Once you get past the learning curve, the productivity gains far outweigh the risks.

CodePath’s Last Demo Day of 2014

CodePath Fall 2014 Demo Day

by Michael Ellison on November 18th, 2014


On Monday, November 10th, we celebrated our last Demo Day of 2014 and allowed designers to compete for the grand prize for the first time.

Over 120 engineers and designers from more than 60 companies, including Apple, Google, Facebook, Dropbox, and Box, competed for over $15k in cash prizes and presented to an all-female panel of senior technology leaders.

iOS for Design Winner

Dogma by Jeff Smith, Kyle Pickering, Loren Heiman

Android Winner

Walk With Me by Rebecca Rich, Maryam Quadir, and Debangsu Sengupta

iOS Winner

Parkway by Sanket Patel, Swaroop Butala, Andrew Chao


Special thanks to our amazing judging panel!


Thinking bigger: a free engineering school

“What do you do?” or “What does your startup do?” are tired questions in San Francisco, a city where you are measured by your works. The superficial startup ideas, the digital smugness, and the unnatural culture of tech celebrity are sometimes enough to make you want to throw your smartphone into the ocean and dive in after it. Go away from San Francisco for a while, though, and you’ll start to remember the thing that drew you there in the first place. You remember that, beneath the fads and clumsy attempts of first-time entrepreneurs, lives a city of dreamers and doers defying cynics and conventional wisdom.

Our dream is to create a modern school for engineers to attend throughout their career. Even more audaciously, we want to offer this school for free. Various experiences have led us down this path: our experience as founders exposed us to the pains of hiring, and our experience as engineers gave us the aspiration to be lifelong learners. We believe we can connect the dots and create a better solution for both.

A founder’s perspective

A founder really only has two critical roles: company direction and hiring. A common mantra in the startup world is “execution is everything”, and, ultimately, it is the team that must execute the vision. Founders often share the trait of being control freaks, so letting go of that control is one of the first challenges they will face. That doesn’t mean letting the company run amok, because setting company direction and expectations is the other half of a founder’s job. I don’t have any sympathy for founders who gripe about their team. All I see is a founder who has failed in their only job.

Unfortunately, hiring is brutally hard. Referral is still the best way to hire, and when that well runs dry, the options are not good. Sometimes, I felt the only thing we could do was watch the months slip by as we tried job boards, contingency recruiters, in-house recruiters, engineering speed dating, engineering auctioning, and any number of ridiculous things.

I find it incredible that large companies use talent acquisition as a major strategy in hiring. I understand how it happens though, when you have executives staring at quarterly hiring targets. Acquisition is probably the only way to expedite the hiring process, so you’re left to solve the challenge of hammering together engineers from different company cultures into a cohesive team. To me, though, that’s like panning for gold when you can build a gold factory instead.

In fact, there is a vast supply of engineers: engineers who are talented, humble, and eager to learn and contribute. Why, then, do companies drool over Ivy League new graduates whose only experience is school projects when there is this other pool of talent available? At the end of the day, it’s about risk. A big part of hiring is mitigating risk, and most experienced engineers don’t have “clean” resumes. Do too many years at the same company mean a lack of initiative? Do too few years mean they are flighty? Also, their technical experience almost certainly doesn’t completely intersect with the tech stack your company is using, so how quickly can they ramp up?

At the end of the day, I can understand the justification for passing on many of these people. Companies avoid risk in hiring so much that one veto amongst six interviews is enough to disqualify a candidate. This is troubling, however, as we have all been in that candidate’s shoes. At some point in our career, a hiring manager or investor took a risk on us, perhaps despite our previous experience, and maybe that was just the opportunity that we needed. In the happiest stories, we went on to be very successful for the company, justifying the risk. Unfortunately, the stories often don’t have happy endings, which causes companies to write off a large number of candidates.

We believe we can change that and broaden the candidate pool beyond those who fit a narrow description. Our school is the perfect environment for individuals to demonstrate initiative and the ability to master new skills quickly. Imagine companies bringing interesting projects to the class, or challenges like Google’s Summer of Code. Opportunities to collaborate are opportunities to build trust, and trust is what makes referral-based hiring so much better than traditional recruiting. I would have gladly exchanged my many hours spent interviewing for time spent mentoring, probably with more productive results. Properly executed, the school could recycle an endless supply of qualified engineers back to grateful companies.

An engineer’s perspective

The only constant is change, and there’s no truer statement in engineering. An engineer’s career is composed of “learning” years and “plateau” years. It’s deadly to stay too long in the “plateau” years. As a result, professional engineers are accustomed to, and take pride in, frequent self-teaching. It’s a good thing that engineers enjoy self-teaching because it’s the only option available currently. It’s certainly not very practical or efficient to go back to college or even get a graduate degree, and community college and other continuing education programs are no match for modern technology. Instead, you’re left sifting through a thousand different eager voices on the internet, most of whom got something working, but also don’t really know what they’re talking about. You sit there, like some kind of technology archeologist, painstakingly piecing together which APIs were thoughtfully designed and which were some historical remnant. Imagine learning how to be a blacksmith by walking into an empty smithy with nothing but a handful of StackOverflow posts. Maybe possible, but very slow.

On the flip side, think back to some of your more fruitful “learning” years. They are probably characterized by two things: a substantial challenge and a significant mentor. In engineering, there are trivial challenges and there are interesting challenges, but both are time consuming for a newcomer. For example, debugging a well-known library issue can take as much time as designing a clever caching strategy. Mentors streamline the trivial challenges by filling in the communal tribal knowledge. Of course, reference texts are also great resources, but because they aim to be comprehensive, important nuggets get lost in a sea of trivia. Lack of mentorship breeds great inefficiencies, and this is made obvious when one observes technologists today praising “recent” advances that are simply rediscovered concepts pioneered decades ago.

We believe that our school brings the same advantages as structured mentorship and allows engineers to challenge themselves throughout their career. Interestingly, it also brings about another important advantage. Collaboration with like-minded peers is so rewarding that it can keep a team together long after the product or engineering challenge loses its luster. However, if you’ve ever tried to chase that feeling by meeting random engineers at a beer-fueled engineering meetup, you’ve probably been disappointed. A learning environment is infinitely more effective at building meaningful collaborative relationships. The relationships may be their own end, may lead to professional collaboration, or may lead to your next cofounder.

CodePath

When we founded CodePath, we wanted to build it on several principles. First and foremost is the standard for excellence. We want to attract the best and that requires the highest quality classes. Second, classes are project-based because engineering is best learned by doing. Third, projects are built in groups because engineering is not a solo activity in industry. To us, product, design, and collaboration are as much a part of engineering as libraries and frameworks, especially in a startup.

Making the classes free is a risky decision. There’s no such thing as a free lunch, and that will make engineers wary of low quality or some kind of catch, so that’s an impression we’ll have to overcome. When we set out to build this school, we were determined to build a school that we would attend ourselves. As a student, I don’t have thousands to spend on classes every year. As long as we can prove that we can train engineers to a certain standard, there’s money on the table from startups and big companies alike.

We’ve already been using our curriculum in corporate training for the past several months, and we’re excited to bring the classes to other engineers. Are you interested? Apply for our iOS evening bootcamp or our Android evening bootcamp. These initial bootcamps are targeted towards experienced developers interested in iOS and Android. If you would like to mentor a group, email us at help@thecodepath.com.

Follow us @thecodepath or join our mailing list for updates.

Asynchronous Processing in Web Applications, Part 2: Developers Need to Understand Message Queues

In the first part of this series, we explained asynchronous processing, when you might need to use it and why leveraging a database for that purpose is not necessarily the best option. In this post, we will explore a smarter approach to asynchronous processing using “message queues”.

Message Queues

Given the reasons that a traditional database is not the right approach for asynchronous processing, let’s take a look at why a message queue might be a smarter choice. With a message queue, we can efficiently support a significantly higher volume of concurrent messages; messages are pushed in real time instead of periodically polled; messages are automatically cleaned up after being received; and we don’t need to worry about any pesky deadlocks or race conditions. In short, message queues present a fundamentally sound strategy for handling high-volume asynchronous processing.

Before we continue, it is important to mention that I am not claiming that a database can never be used as a queue. In fact, every database can be made to work as a queue at low or medium volume, given enough time and effort from a clever developer. Moreover, there are databases such as PostgreSQL that have additional support for queueing, along with solid libraries that can make this a manageable solution. If you are processing a low to medium volume of messages largely intended to be used as a job queue, and you already use PostgreSQL to store your data, there are undoubtedly cases where avoiding a separate message queue is the best choice, at least until you feel pain. Still, a message queue can be quite a bit more versatile, powerful, and flexible depending on your needs. With that said, let’s dive into the world of message queues.

What is a message queue?

At the simplest level, a message queue is a way for applications and discrete components to send messages to one another in order to communicate reliably. Message queues are typically (but not always) ‘brokers’ that facilitate message passing by providing a protocol or interface which other services can access. This interface connects producers, which create messages, and consumers, which then process them. Within the context of a web application, one common case is that the producer is a client application (e.g., Rails or Sinatra) that creates messages based on interactions from the user (e.g., a user signing up). The consumers in that case are typically daemon processes (e.g., rake tasks) that process the arriving messages. In other cases, various subsystems will need to communicate with each other via these messages.
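To make that concrete, here is a minimal sketch of both halves using RabbitMQ’s Python client, pika (1.x API); the queue name and payload are invented for illustration:

import json
import pika

# --- producer half: e.g., run inside the web app when a user signs up ---
conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="signups", durable=True)   # queue survives broker restarts
channel.basic_publish(
    exchange="",                       # default exchange routes by queue name
    routing_key="signups",
    body=json.dumps({"user_id": 42}),
    properties=pika.BasicProperties(delivery_mode=2),  # persist message to disk
)
conn.close()

# --- consumer half: in practice a separate daemon process ---
def handle_signup(ch, method, properties, body):
    user = json.loads(body)            # ... send the welcome email, etc. ...
    ch.basic_ack(delivery_tag=method.delivery_tag)     # removed only once done

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="signups", durable=True)
channel.basic_consume(queue="signups", on_message_callback=handle_signup)
channel.start_consuming()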

Features

On top of the fundamental ability to pass messages quickly and reliably, most message queues offer additional complementary features. For instance, multiple separate queues can allow different messages to be passed to different queues. This allows messages of different types to be consumed by different services: the consumer processing email-sending requests can be on a totally different queue (and/or server) than the one that resizes uploaded images. Message delivery behavior will also vary depending on your needs. Certain messages go from one producer to a single consumer (direct), while other times a message is sent to multiple listening consumers (fanout).
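Continuing the pika sketch above, fanout delivery might look roughly like this (the exchange name is made up):

# producer: one publish reaches every listening consumer
channel.exchange_declare(exchange="events", exchange_type="fanout")
channel.basic_publish(exchange="events", routing_key="", body=b"user.created")

# each consumer binds its own private, auto-named queue to receive a copy
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(exchange="events", queue=result.method.queue)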

Another critical feature of message queues is robustness and reliability through persistence strategies. To keep the reliability of messages high, most message queues offer the ability to persist all messages to disk until they have been received and completed by the consumer(s). Even if the applications or the message queue itself happens to crash, the messages are safe and will be accessible to consumers as soon as the system is operational. In contrast, transient tasks performed synchronously in a web app or on a thread in memory will be lost if anything goes awry. This is especially relevant when dealing with deployments and updating code, since restarting components no longer puts your messages or tasks at risk.

In addition, message queues can give you greater visibility into the volume of messages. At a minimum, since there is typically a broker for messages, inspecting the tasks being processed at any given time is much easier than with synchronous processing. Similarly, handling a high volume of tasks is simpler, since you can horizontally scale your message consumers to handle higher loads with minimal fuss. As an added bonus, the application itself no longer has to deal with these tasks, so higher volumes of tasks won’t slow down the response times of the front-end web app.

Service Oriented Architecture

Arguably, though, the most powerful reason to introduce a message queue is the architectural benefit its use can afford. In particular, message queues enable lightweight communication between any number of disparate services, which makes it much easier to separate your application into many small subsystems. That often improves your architecture in several ways: the individual pieces become easier to maintain, test, debug, and scale. In addition, with a service-oriented approach made possible by message queues, you can easily have multiple teams working along well-defined boundaries, even using different tools or stacks where necessary.

Considerations

All that said, message queues are not a panacea and, like all tools, have their downsides. Setting up and configuring message queues, especially more complicated ones, can add a lot of moving parts to your application. Often in small apps you don’t actually need to introduce that overhead early on, and you can instead slowly offload tasks to message queues as the volume of traffic increases over time. Also, with a traditional message queue, error handling is sometimes a very manual effort when a message or task fails, and communicating with message queues adds a certain complexity to your application logic. In addition, for more general queues, you often have to define your own state machine for messages. That is, a message that needs to be passed to three different services for processing requires you to manually architect a queue workflow that supports that. Nonetheless, for asynchronous processing of many types, a message queue is often the best tool for the job.

Still, which message queue should we use then? What options and alternatives exist? Why would we pick one over another? What are the differences?

Comparing Message Queues

Overview

The good news and the bad news is that there are a lot of message queues to choose from. In fact, there are dozens of message queues with all sorts of names, features, and different pros and cons, such as Sparrow, Starling, Kestrel, Kafka, Amazon SQS, and many more that will not be discussed here at length.

The easiest place to start is with the emerging open standard protocols for message queues: AMQP and STOMP. These are the most popular message queue standards around today, and many message queues implement one or both of these protocols. In addition, there is also JMS (Java Message Service), which is widely used on top of the JVM. For completeness, there is also MSMQ, which is not covered in this post but is the de facto queue for most .NET applications.

Message Queue Options

Certain generic message queue implementations have emerged as well, many of which are built on top of the aforementioned protocols. The most popular and well-supported general-purpose message queues available today are RabbitMQ, Apache ActiveMQ, and ZeroMQ.

How do these options compare? Well, fortunately all three tools are being actively developed and maintained, have a large user base, good documentation and decent client libraries across many languages. So in a way, you can’t really go wrong with any of these options. However, let’s spend a moment understanding how they compare with one another. In the end, the one to use depends on the needs of your application and the required flexibility.

ZeroMQ

The interesting thing to understand is that ZeroMQ is not so much a pre-packaged message queue like the others, but instead acts as a framework for building message queues. ZeroMQ focuses mostly on passing messages very efficiently over the wire, while RabbitMQ acts as a full-fledged ‘broker’ which handles persisting, filtering, and monitoring messages. ZeroMQ has no broker built in, which means it does not have a central dispatcher to manage your messages; it is really not a “full service” message queue.

Think of ZeroMQ as its own toolbox or framework for creating message queues tailored to your own needs; the sketch below gives a taste of what basic messaging with ZeroMQ looks like. RabbitMQ, on the other hand, tries to be a more complete queue implementation. RabbitMQ is much more packaged and as such requires less configuration and setup overhead for typical use cases.
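Here is a minimal, self-contained sketch of brokerless messaging with pyzmq: two peers talking directly to each other, with one thread standing in for the remote peer (the address and message contents are invented):

import threading
import zmq

ctx = zmq.Context()

def echo_peer():
    # Binds a REP socket and echoes each message back. Note there is no
    # broker anywhere: the two sockets connect to each other directly.
    sock = ctx.socket(zmq.REP)
    sock.bind("tcp://127.0.0.1:5555")
    for _ in range(2):
        msg = sock.recv_string()
        sock.send_string("echo: " + msg)

t = threading.Thread(target=echo_peer)
t.start()

# The other peer connects and "chats" with the first one.
sock = ctx.socket(zmq.REQ)
sock.connect("tcp://127.0.0.1:5555")
for text in ["hello", "anyone there?"]:
    sock.send_string(text)
    print(sock.recv_string())
t.join()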

Of course, with RabbitMQ or ActiveMQ, the built-in broker and persistence add quite a bit of overhead, but those projects choose to sacrifice raw speed to provide a much richer feature set with less manual tinkering. ZeroMQ is an excellent solution when you want more control or just want to do it yourself. In other cases, where you just want a queue for typical use cases and are willing to accept the higher overhead, you should consider RabbitMQ or ActiveMQ.

RabbitMQ and ActiveMQ

Comparing RabbitMQ with ActiveMQ is closer to a head-on comparison, because they are solving similar problems. However, the differences here are mostly in the details. ActiveMQ is built in Java on JMS (the Java Message Service) and is very frequently used within applications on the JVM (Java, Scala, Clojure, et al). ActiveMQ also supports STOMP, which provides support for Ruby, PHP, and Python. RabbitMQ is built in Erlang, powered by AMQP, and is used frequently with applications in Erlang, Python, PHP, Ruby, et al.

As you might expect, the developers’ preferences influenced the choices made throughout each queue. For example, the configuration for ActiveMQ is in XML, and the routing of messages is handled with custom rules defined by ActiveMQ. In contrast, the configuration of RabbitMQ uses an Erlang syntax, and the advanced routing and configuration follow the standard AMQP specification. The protocols (AMQP vs. JMS) used by each queue have certain underlying differences as well. One key difference is that in AMQP a producer sends a message to the broker without knowing the intended distribution strategy, while in JMS the producer is explicitly aware of the strategy to be used.

The key takeaway here is that for a general-purpose message queue that handles all your messaging needs and supports the most advanced requirements, you could use any of the three options listed above. However, my preference and recommendation in most cases for a Ruby or Python web application is RabbitMQ on AMQP. Conversely, if you are building on the JVM stack with Scala or Java, then my recommendation would be ActiveMQ. In either case, if you are looking for a customizable general message queue, you won’t be disappointed. Of course, don’t be too quick to dismiss ZeroMQ if you are looking for raw speed and are interested in a light-weight, do-it-yourself protocol for delivering messages.

Specialized Queues

This is not the end of the story, though. While message queues are incredibly helpful here, the major ones listed above were built to be generic and general purpose. That is, they don’t solve one particular use case but instead support a wide range of use cases requiring varying levels of configuration. These ‘industrial strength’ message queues can power chat rooms, instant messaging services, inter-service communication, or even multiplayer online games.

For the average web application, though, the requirements are very different. To understand your requirements, ask yourself whether you are sending messages to communicate between different services in your application or just want to process simple background jobs. In the latter case, we certainly don’t need the most powerful or flexible message queue, nor its associated setup costs.

Intra-service vs Job Processing

Most web applications really only need a way to do background job processing and offload tasks to an asynchronous queue. These more specific and constrained requirements open up the possibility of a lighter-weight message queue that is easier to use and focused on doing one thing well.
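As a preview of that category, here is roughly what a dedicated job queue reduces the whole affair to. This sketch assumes Python’s RQ library backed by Redis, and send_welcome_email is a hypothetical function of your own:

from redis import Redis
from rq import Queue

from myapp.tasks import send_welcome_email   # hypothetical app code

q = Queue(connection=Redis())
# Enqueue a call to send_welcome_email(user_id=42); a separate
# `rq worker` process picks the job up and runs it in the background.
job = q.enqueue(send_welcome_email, user_id=42)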

Takeaways

In this article we have covered the features of a message queue, how message queues work, the popular message queue libraries, and how to decide whether you need a general message queue or a more specialized one. Hopefully you now understand the potential benefits of introducing a message queue into your system architecture, but also the tradeoffs and overhead of adding one prematurely. In addition, you should be able to see how a message queue complements a traditional database and how queues, used appropriately, can help improve the modularity, maintainability, and scalability of your applications.

In the next part of this series, we will explore a specialized message queue specifically built to process background jobs called a work queue. We will explore how a work queue operates, what features they typically have, how to scale them and which are most popular today. Hope that this introduction was helpful and please let me know what topics or issues you’d like to have covered in future parts of this series.

Asynchronous Processing in Web Applications, Part 1: A Database Is Not a Queue

When hacking on web applications, you will inevitably find certain actions that take too long and as a result must be pulled out of the HTTP request/response cycle. In other cases, applications need an easy way to reliably communicate with other services in your system architecture.

The specific reasons will vary; perhaps the website has a real-time element, there’s a live chat feature, we need to resize and process images, we need to slice up and transcode video, do analysis of our logs, or perhaps just send emails at a high volume. In all the cases though, asynchronous processing becomes important to your operations.

Fortunately, there are a variety of libraries across all platforms intended to provide asynchronous processing. In this series, I want to explore this landscape, understand the solutions available, how they compare and more importantly how to pick the right one for your needs.

Asynchronous Processing

Let’s start by understanding asynchronous processing a bit better. Web applications undoubtedly have a great deal of code that executes as part of the HTTP request/response cycle. This is suitable for faster tasks that can be done within hundreds of milliseconds or less. However, any processing that would take more than a second or two will ultimately be far too slow for synchronous execution. In addition, there is often processing that needs to be scheduled in the future and/or processing that needs to affect an external service.

Synchronous HTTP Illustration

In these cases when we have a task that needs to execute but that is not a candidate for synchronous processing, the best course of action is to move the execution outside the request/response cycle. Specifically, we can have the synchronous web app simply notify another separate program that certain processing needs to be done at a later time.

Asynchronous Processing

Now, instead of the task running as a part of the actual web response, the processing runs separately so that the web application can respond quickly to the request. In order to achieve asynchronous processing, we need a way to allow multiple separate processes to pass information to one another. One naive approach might be to persist these notifications into a traditional database and then retrieve them for processing using another service.

Why not a database?

There are many good reasons why a traditional database is not well-suited for cases of asynchronous processing. Often you might be tempted to use a database because there can be an understandable reluctance to introduce new technologies into a web stack. This leads people to try to use their RDBMS as a way to do background processing or service communication. While this can often appear to ‘get the job done’, there are a number of limitations and concerns that should not be overlooked.

Asynchronous Processing Model

There are two aspects to any asynchronous processing: the service(s) that create processing tasks and the service(s) that consume and process those tasks. When using a database for this purpose, there would typically be a database table whose records represent pending tasks, along with a flag indicating each task’s state and whether it has completed.

Polling Can Hurt

The first issue revolves around how to best consume and process these tasks stored within a table. With a traditional database this typically means a service that is constantly querying for new processing tasks or messages. The service may have a database poll interval of every second or every minute, but the service is constantly asking the database if there has been an update. Polling the database for processing has several downsides. You might have a short interval for polling and be hammering your database with constant queries. Alternatively, you could perhaps set a long interval in which case there will be many unnecessary processing delays. In the event of having multiple processing steps, the fastest route through your system has now been delayed to the sum of all the different polling intervals.

DB Polling

Polling requires very fast and frequent queries to the table to be most effective, which adds a significant load to the database even at a medium volume. Even worse, a given database table is typically optimized to be fast at adding data, updating data, or querying data, but almost never all three on the same table. If you are constantly inserting, updating, and querying, race conditions and deadlocks become inevitable. These ‘locks’ occur because multiple consumers are all “fighting” with each other over the same table while the ‘producers’ are constantly adding new items. You will see database load increase, performance decrease, and pile-ups become increasingly common as the volume scales up.
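To illustrate, a naive polling consumer over a hypothetical ‘tasks’ table might look like the sketch below (Python with psycopg2, all names invented); note how it bakes in both the polling trade-off and the row-claiming race just described:

import time
import psycopg2

conn = psycopg2.connect("dbname=app")

def process(task):
    pass  # stand-in for the real task handler

while True:
    with conn, conn.cursor() as cur:  # "with conn" commits the transaction
        # Claim the oldest pending task. Without extra locking, two workers
        # can race each other and claim the same row.
        cur.execute(
            """UPDATE tasks SET state = 'running'
               WHERE id = (SELECT id FROM tasks
                           WHERE state = 'pending'
                           ORDER BY id LIMIT 1)
               RETURNING id, payload"""
        )
        task = cur.fetchone()
    if task is None:
        # The poll-interval dilemma: sleeping longer delays every task,
        # sleeping less hammers the database with queries.
        time.sleep(5)
    else:
        process(task)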

Manual Cleanup

In addition to that, clearing old jobs can be troublesome, because at even a medium volume there will be many tasks in the database table. At certain intervals, the old completed tasks need to be removed, otherwise the table will grow quite large. Unfortunately, this has to be performed manually, and deletes are often not particularly efficient, especially when rows are being removed so frequently while updates and queries are happening on the same table at the same time.

Scaling Won’t Be Easy

In the end, while a database will appear to work at first as a simple way to send background instructions to be processed, this approach will come back to bite you. The many limitations such as constant polling, manual cleanups, deadlocks, race conditions and manual handling of failed processing should make it fairly clear that a database table is not a scalable strategy for this type of processing.

Takeaways

The takeaway here is that asynchronous processing is useful whether you need to send an SMS, validate emails, approve posts, process orders, or just pass information between services. This type of background processing is simply not a task that a traditional RDBMS is best-suited to solve. While there are exceptions to this rule, such as PostgreSQL and its excellent LISTEN/NOTIFY support, there is a different class of software that was built from scratch for asynchronous processing use cases.
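For completeness, LISTEN/NOTIFY is what makes PostgreSQL the notable exception: consumers can block on a socket instead of polling a table. A minimal listener with psycopg2 looks roughly like this (channel name and payload invented):

import select
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=app")
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

cur = conn.cursor()
cur.execute("LISTEN tasks;")   # a producer runs: NOTIFY tasks, 'some-payload'

while True:
    # Block until the connection's socket is readable; no table polling.
    if select.select([conn], [], [], 60) == ([], [], []):
        continue               # timed out; wait again
    conn.poll()
    while conn.notifies:
        note = conn.notifies.pop(0)
        print("received:", note.payload)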

This type of software is called a message queue and you should strongly consider using this instead for a far more reliable, efficient and scalable way to perform asynchronous processing tasks. In the next part of this series, we will explore several different message queues in detail to understand how they work as well as how to contrast and compare the various options. Hope that this introduction was helpful and please let me know what you’d like to have covered in future parts of this series.

Continue to Part 2: Developers Need To Understand Message Queues

Adventures in Scaling, Part 3: PostgreSQL Streaming Replication

In my last post in this series, I covered how to get PostgreSQL set up, running, and properly tuned on your system. That works well for setting up a primary database, but depending on your needs, you may find yourself in a position where you would like to start taking advantage of the Streaming Replication and High Availability built into PostgreSQL 9.

When running a database such as PostgreSQL on a server, a single primary master database that performs all reads and writes for your applications will likely not be able to sustain your traffic forever. At a certain point, your application will generate too many concurrent reads and writes, you’ll want to be able to recover from database failure, or for whatever reason your database will need help scaling.

Hot Standby? High Availability?

On a high level, the challenge with scaling out a database is a problem of synchronization. With a single database, there is a single source of truth for your application. All records are in one place, which acts as the authority for your data. However, once you begin scaling your database, you have multiple servers all responsible for your data. As data is created, updated, and destroyed, this information must be propagated as quickly as possible to all of the other servers as well. If servers are out of sync, data can be mis-reported or outright lost. Each database must therefore be kept in a consistent state with the others, with little margin for error. Since no particular approach solves this problem perfectly for all needs, there are instead many strategies for scaling a database.

The approach you should take as load increases has a lot to do with the specific characteristics of your application. Is your application read-heavy or write-heavy? If your application has many more reads by volume, then the most common strategy is to set up secondary databases that synchronize data with the primary database. This strategy of setting up read-only databases that track changes to the primary is only part of the scaling story. There will also be cases where you have too many writes for a single box (multi-master) or you need to distribute data across many boxes (sharding), but this post will focus on solving the problem of simply scaling reads. Look for a future post on more advanced scaling and replication strategies.

Fortunately, the strategy for scaling reads is fairly straightforward. The primary database that accepts writes is called the “master” and all other databases that can only help with reads are called “slaves”. Typically, you can set up as many slaves as you’d like to help distribute the querying load on the database. In addition, in some database setups, these slaves can also help in cases of database/server failure. In the event that the master database fails or becomes unavailable, a slave in a good position can be elected to take over as the master so that your application can still accept and retrieve data without downtime. In this scenario, the slaves that can be elected to be masters are called hot standbys, and this concept is one strategy contributing toward achieving high availability, which means that your database is robust against failure.
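At the application level, the read/write split often amounts to something as simple as the following sketch (the connection strings are placeholders; Rails users can instead reach for a gem like the Octopus gem mentioned at the end of this post):

import psycopg2

PRIMARY = "host=10.0.0.1 dbname=app"   # accepts reads and writes
STANDBY = "host=10.0.0.2 dbname=app"   # read-only hot standby

def get_connection(readonly=False):
    # Route read-only work to the standby; everything else to the master.
    return psycopg2.connect(STANDBY if readonly else PRIMARY)

reporting_conn = get_connection(readonly=True)  # heavy SELECTs hit the standby
checkout_conn = get_connection()                # INSERTs/UPDATEs go to the master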

Understanding Replication

To understand how to achieve high availability and how multiple PostgreSQL databases stay synchronized, one must understand the concepts of “continuous archiving” and “write-ahead logs”. The key concept of high availability has to do with the database being resilient to server failure. Specifically, in the event that one database server goes down, another one should be ready to step up and assume its responsibilities. This secondary server that must be ready to step up at any time is called a “standby”.

In order to achieve this synchronization, the primary and standby servers work together. The primary server is continuously archiving the data coming in or being modified, while each standby server operates in continuous recovery mode, reading the archives from the primary. This “continuous archive” is a comprehensive log of all activity coming into the database (every table created, every row inserted, et al). This comprehensive log is called the “write-ahead log” or WAL and is essential to the replication process.

WAL is described well in the PostgreSQL user’s guide:

Write-Ahead Logging (WAL) is a standard method for ensuring data integrity. WAL’s central concept is that changes to data files (where tables and indexes reside) must be written only after those changes have been logged, that is, after log records describing the changes have been flushed to permanent storage. If we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any changes that have not been applied to the data pages can be redone from the log records.

As logs are written by the master, these WAL records are transferred to the other database servers that act as standbys. Generally the best way for this to happen is to stream WAL records to the standby as they’re generated, without waiting for the WAL file to be completely filled. This process of moving the WAL records (in small chunked segments) is known as “Streaming Replication”. One important aspect to note about this process is that the transfer happens asynchronously. This means that WAL records are shipped after a transaction has been committed, so there is a small delay between committing a transaction on the primary and the changes becoming visible on the standby. However, streaming replication generally keeps this delay under one second, unlike older file-based transfer strategies.

Setting This Up

Great, now that we have covered the high-level concepts, let’s jump into how to setup hot standby streaming replication step-by-step. This guide assumes you have a UNIX server provisioned (Debian-based) and have already followed our guide to setup PostgreSQL on that machine.

The first thing to do to set up a standby is to provision another server (preferably with the same specs as the primary) and follow the steps to set up the basic PostgreSQL server again, as similarly to the primary database as possible. Ideally, the standby is essentially a clone of your first server, so if you can duplicate the server, feel free to do that as a time saver. These database servers should ideally be part of the same local network and communicate through their private IPs.

Master Setup

Next, we need to set up the master so that the standby machines are allowed to connect to it, in /etc/postgresql/9.1/main/pg_hba.conf:

# /etc/postgresql/9.1/main/pg_hba.conf
host   replication      all             SLAVEIP/32              trust

Be sure to replace SLAVEIP with the correct IP address for the standby. Next, it is a good practice to create the empty WAL directories on both the master and the slaves:

master$ sudo mkdir /var/lib/postgresql/9.1/wals/
master$ sudo chown postgres /var/lib/postgresql/9.1/wals/
slave$ sudo mkdir /var/lib/postgresql/9.1/wals/
slave$ sudo chown postgres /var/lib/postgresql/9.1/wals/

You should also be sure that the master has easy keyless SSH access to the slave:

su - postgres
ssh-keygen -b 2048
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave.host.com

Now that your master can easily access your slave, you will need to enable streaming replication and WAL logging for your database in /etc/postgresql/9.1/main/postgresql.conf:

# /etc/postgresql/9.1/main/postgresql.conf
# ...
# Listen allows slaves to connect
listen_addresses = 'MASTERIP, localhost'
wal_level = hot_standby
wal_keep_segments = 32 # number of WAL segments to retain for standbys
max_wal_senders = 2 # max number of standby servers that may connect
archive_mode    = on
archive_command = 'rsync -a %p root@slave.host.com:/var/lib/postgresql/9.1/wals/%f </dev/null'

Now create a user that the slave can use to connect to the master:

> SELECT pg_switch_xlog();
> CREATE USER replicator WITH SUPERUSER ENCRYPTED PASSWORD 'secretpassword12345';

At this point you should probably restart PostgreSQL:

sudo /etc/init.d/postgresql restart

In order for the standby to start replicating, the entire database needs to be archived and then reloaded into the standby. This process is called a “base backup”, and can be performed on the master and then transferred to the slave. Let’s create a snapshot to transfer to the slave by capturing a binary backup of the entire PostgreSQL data directory.

su - postgres
psql -c "SELECT pg_start_backup('backup', true)"
rsync -av --exclude postmaster.pid --exclude pg_xlog /var/lib/postgresql/9.1/main/ root@slave.host.com:/var/lib/postgresql/9.1/main/
psql -c "SELECT pg_stop_backup()"

This will transfer all the binary data from the PostgreSQL master to the slave. Next, it’s time to set up the standby machine to begin replication.

Standby Setup

First, let’s shutdown the database on the standby:

sudo /etc/init.d/postgresql stop

Let’s enable the slave as a hot standby:

# /etc/postgresql/9.1/main/postgresql.conf
# ...
wal_level = hot_standby
hot_standby = on

and modify the recovery settings:

# /var/lib/postgresql/9.1/main/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=MASTERIP port=5432 user=replicator password=secretpassword12345'
trigger_file = '/tmp/pgsql.trigger'
restore_command = 'cp -f /var/lib/postgresql/9.1/wals/%f %p </dev/null'
archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/9.1/wals/ %r'

One interesting option in the file above that is worth noting is the trigger_file parameter, which tells the standby when to become the master. If that file is ever created (for example, after the master machine fails), the slave will be triggered and promote itself to master. In the event that this standby is one day elected as the master, you need to ensure certain files are in place. Let’s clone the connections listed on the master:

# /etc/postgresql/9.1/main/pg_hba.conf
host   replication      all             MASTERIP/32              trust
host   replication      all             SLAVEIP/32              trust
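As an aside, when the time eventually comes to fail over, promoting this standby is just a matter of creating the trigger file named above:

slave$ touch /tmp/pgsql.trigger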

At this point, the slave should be ready to start accepting WAL records and replicating all data from the primary. Let’s restart the database:

sudo /etc/init.d/postgresql start

Verification

Check the logs to make sure that the connection was properly established on the slave (and/or master):

# /var/log/postgresql/postgresql-9.1-main.log
LOG:  entering standby mode
LOG:  streaming replication successfully connected to primary

If you see those two lines, then this means replication has started successfully. If not, follow the error messages and fix the configuration until you can restart and see these lines.

You can also check certain processes on the machines to ensure replication is established:

master$ ps -ef | grep sender
slave$ ps -ef | grep receiver

If both machines have the expected process, then that means that replication is running along happily.
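You can also ask the master directly which standbys are currently streaming by querying the pg_stat_replication view (available as of PostgreSQL 9.1):

master$ psql -c "SELECT client_addr, state FROM pg_stat_replication;"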

Wrapping Up

At this point, replication should be established on the standby machine and you can repeat this process for any number of additional standbys. Once a standby is established, the WAL records should be continuously streamed to the machine keeping the data synchronized with the primary. In the event of a disconnection, the standby is generally resilient enough to backfill data and catch back up to the primary.

Once the standbys are in place, since they have been configured as ‘hot’, they can easily be used as read slaves. Simply set up your application to read from these slaves but write only to the master. If you are using Rails, one good gem that supports this concept of standbys is called Octopus.

Hopefully you have found this post useful as a straightforward guide to understanding and setting up hot standby streaming replication. In a future post, we hope to cover more advanced replication strategies as well as a detailed guide from how to do a large data migration from MySQL to PostgreSQL. If you run into any problems or spot errors in this guide, please let us know so we can update the steps.