Cliff Click of Azul Systems recently presented “A JVM Does What?” at Shopzilla here in Santa Monica. About 60 folks from the Santa Monica Java Users’ Group turned out to participate.

Cliff Click Presents “A JVM Does What”

SM-JUG Cliff Click Feb 17 2011 from Shopzilla on Vimeo.


SMJUG: Tech Talk with Cliff Click

Posted Thursday, February 10th 2011

Come on out to Santa Monica and join us for the premiere Santa Monica Java Users’ Group. Our inaugural event will feature a tech talk from Cliff Click. The first SMJUG will be hosted at Shopzilla.com on Thursday, February 17th at 7:00pm, and is open to aficionados of the Java language.

Cliff Click

Cliff Click is Chief JVM Architect and Distinguished Engineer at Azul Systems, an enterprise compute appliance vendor providing scalability, manageability, and real-time monitoring for Java technology-based applications. He holds a Ph.D. in computer science from Rice University and is hosting sessions on benchmarking, alternative languages on the JVM, and how modern architectures impact Java applications.

About Santa Monica Java Users’ Group

The Santa Monica Java Users’ Group is focused on building a community for Java engineers in Los Angeles, and is hosted in Santa Monica. SMJUG is a collaborative initiative sponsored by Shopzilla, Edmunds and eHarmony.


Fire Chief

Posted Friday, March 12th 2010

“There is a production problem!! All hands on deck!!”

Really? Do we actually need to involve every member of the team? I say, probably not.

On a team with myriad legacy systems, production problems will often be a significant burden for the team. In my experience, without a strategy for managing the team’s approach to tackling these production ‘fires’, the team’s yield for new value creation will be far below its potential.

Without compromising the core tenet of a cross-trained team, how does production support work in an agile environment? How is a “production support” role best framed? While the first step may be acknowledging the need for a strategy, the real “magic” in making your strategy work comes from the team’s ownership of the need.

Fire!

Who doesn’t love a Fire Chief? My little nephew is obsessed with fire trucks, and would probably trade cookies for a trip down to the fire station! And isn’t the Fire Chief the bravest and smartest fire fighter of them all?

With few exceptions, it generally doesn’t take an entire team to fight a fire. The “all hands on deck” approach to production issues is often born from a misguided – albeit well-intentioned – desire to resolve an issue quickly in order to get back to work on the “new stuff”. Best case, this approach may optimize a single engineer’s time-to-resolution at the cost of lowering the overall yield of the team. Worst case, “drop everything” is a trained response designed as much to create the appearance of motion as any real progress.

Our approach has been to define a new role – the Engineering Fire Chief. Simply put, our Fire Chief is an engineer (or two) who actively accepts the role of providing distraction-free “cover” for their team. (Yes, we actually bought a Fire Chief hat.)

While the team is working away on creating net-new value for our business, the Fire Chief’s duty is to put out the production fires, so that the team can focus fully on their stories without distraction. After a couple of iterations’ worth of assessing and adjusting the Fire Chief position to make it a success, here is what we’ve learned:

Rotation

Whether or not to rotate the Fire Chief role amongst all team members was an easy decision. It’s something I’m passionate about, but more importantly, it’s something the team is passionate about. After all, our software belongs to all of us. We started a rotation from the moment the Fire Chief role was coined (and we had acquired the hat). The hat, while being a bit silly (fun?), creates an informal but important “hand-off” of the responsibility of running our technology from Fire Chief to Fire Chief. It truly is a relief to hand it off, but we all have a laugh about it too, which makes our little ceremony fun.

Rotation Frequency

We experimented with a variety of alternatives, from 1 week, to 2 weeks, to 1 month. Our trial-and-error approach was really interesting, and over time has taught us a lot about what the Fire Chief role is really all about. Ultimately we settled on 2 weeks (the same length as our Iteration). The Iteration start and end is a natural breaking point for the Fire Chief rotation, and we’ve created a recurring forum for our business partners when the new Fire Chief is handed the hat. In this forum we focus on lingering issues and/or minor features or fixes which need to be applied to the legacy world. The new and retiring Fire Chiefs also engage in a hand-off discussion.

Cover Man Means Cover Man

Somehow the mentality had become ingrained in the team that some problems required the attention of the entire team (per this article’s first quote). During one particular iteration, our feed processing pipeline was experiencing a slowdown. After several days of slowdown, with no solution in sight, the risk became large enough that the entire team was ‘required’ to dive in. All of our new product development (the stories in our iteration) ground to a halt. When we eventually found the problem and fixed it, we determined as a team that it didn’t actually require the entire team’s attention to find the root cause. This was perhaps one of our more valuable retrospectives. To summarize, while it took us a while to notice cause and effect, we eventually saw that when the entire team jumped on a production problem, our velocity for the iteration was drastically impacted.

More broadly, this helped us refine the Fire Chief framing as a ‘cover man’. Prior to this, and with the Fire Chief accounted for in our normal iteration capacity, Fire Chiefs were scrambling to resolve problems, and in doing so, were involving other team members. A Fire Chief would involve another team member in order to solve the production problem as quickly as possible and get back to the stories in the iteration. In involving other team members, the Fire Chief was unintentionally impacting the broader team’s ability to deliver. The Fire Chief was accidentally hurting the team’s velocity.

The above distilled, we simply no longer count the Fire Chief’s capacity in the iteration. As a result, the person in this role is less pressured and is free to discover root causes and develop improvements without impacting or distracting other team members. The Fire Chief is truly a cover man.

Make the Production World Better for the Next Fire Chief

What good is a rotation of Fire Chiefs who are focused on band-aids, where no improvement of our legacy world occurs? It took some time, but by creating the right mix of rotation, rotation frequency, and cover-man-means-cover-man mentality, we’re now able to improve the legacy world with each Fire Chief’s tenure. Not counting the Fire Chief’s capacity in the iteration means that the Fire Chief can spend time finding and fixing problems at their root cause. One recent Fire Chief exposed new and interesting real-time performance metrics via JMX, and added visualizations of the metrics to our internal dashboards. This dashboard acts as a real-time window into performance problems, and improves every subsequent Fire Chief’s ability to visualize and solve problems.
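For readers who haven’t exposed a metric over JMX before, here is a minimal sketch of the general technique using only the JDK. It is not the code our Fire Chief wrote: the class names, the two metrics, and the ObjectName "com.example.feeds:type=FeedMetrics" are all hypothetical, chosen purely for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example: every name below is an assumption made for illustration.
class FeedMetricsExample {

    /** Management interface. A standard MBean pairs a class Foo with an interface FooMBean. */
    public interface FeedMetricsMBean {
        long getRecordsProcessed();
        double getAverageLatencyMillis();
    }

    /** Thread-safe counters that the processing code updates as it works. */
    public static class FeedMetrics implements FeedMetricsMBean {
        private final AtomicLong records = new AtomicLong();
        private final AtomicLong totalNanos = new AtomicLong();

        /** Record one processed item and how long it took. */
        public void record(long elapsedNanos) {
            records.incrementAndGet();
            totalNanos.addAndGet(elapsedNanos);
        }

        @Override
        public long getRecordsProcessed() {
            return records.get();
        }

        @Override
        public double getAverageLatencyMillis() {
            long n = records.get();
            return n == 0 ? 0.0 : (totalNanos.get() / 1_000_000.0) / n;
        }
    }

    /** Register the metrics with the JVM's platform MBean server under the given name. */
    public static ObjectName register(FeedMetrics metrics, String name) {
        try {
            ObjectName objectName = new ObjectName(name);
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(metrics, objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException("Could not register MBean " + name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        FeedMetrics metrics = new FeedMetrics();
        ObjectName name = register(metrics, "com.example.feeds:type=FeedMetrics");

        metrics.record(2_000_000L); // a 2 ms item
        metrics.record(4_000_000L); // a 4 ms item

        // Read the attributes back through the MBean server, as a JMX client would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println(server.getAttribute(name, "RecordsProcessed"));
        System.out.println(server.getAttribute(name, "AverageLatencyMillis"));
    }
}
```

Once registered, attributes like these can be browsed with a JMX client such as JConsole, or polled by whatever collector feeds a dashboard.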

~

How do you deal with “production support”? How are you able to evolve your systems and services while maintaining uptime and SLAs for existing legacy systems?


I am a Lead Engineer

Posted Wednesday, February 3rd 2010

Continuing on our “I am” riff, Petter and I decided to frame our Lead Engineer role here at Shopzilla.

I am a Lead Engineer:
  • I hold “excellence” as my yardstick.
  • I am a software craftsman, I am proud of our code, and I promote this craftsmanship and pride amongst other engineers.
  • I lead by example, and I love getting my hands dirty.
  • I love to learn and teach through conversations about code.
  • I favor reuse over reinventing the wheel.
  • I am passionate about performance and know there is no such thing as “perf spray”.
  • I engage other engineers, and in seeking them out I foster communication.
  • I heart QA.
  • I ensure my decisions are based on not just my own competence, but that of the whole team.
  • I am a mentor, and I understand the difference between posing questions and mandating solutions.
  • I am approachable.
  • I encourage my teams to develop their own technical voice.
  • I ensure my engineering team is not an island – our discoveries are shared; our struggles are too.
  • I help strike the balance between quality and getting functionality out the door.
  • I help business owners translate ideas into technical solutions.
  • I find the bumps in the road before we hit them.

I am a lead engineer and my success is the sum of the success of the people around me.


Leveraging Archetypes

Posted Thursday, October 1st 2009

Creating a new web-service should be a reasonably simple operation.

During the rebuild of our Bizrate and Shopzilla websites we found that we built a lot of similar web services: simple read access, with XML payloads, to enriched data sourced from other web services and databases.  Each service would also support a variety of common endpoints for health checks and configuration information, and include a standard set of MBeans for operational monitoring.

We invested in a Maven Archetype to create a template for developing a new service.

From a Configuration Management perspective all our services have a consistent approach to builds, configuration and deployment.  From an Engineering perspective we have consistent project structures, technology choices and best practices.  Examples include connection pooling strategies and libraries, the layout of Maven projects (including separation of concerns between Maven modules), publishing project documentation via Continuous Integration, environment-agnostic archives, and code coverage checks.

An example of a command to generate a new service from an archetype:

mvn archetype:generate -DarchetypeGroupId=com.shopzilla.archetypes \
-DarchetypeArtifactId=shopzilla-modular-hibernate \
-DarchetypeVersion=2.0-SNAPSHOT \
-DgroupId= \
-DartifactId= \
-Dversion= \
-Dservice-version= \
-Dshortened-service-name= \
-Duser= \
-DcurrentYear=

This produces the source code for a ready-for-Git, ready-for-Hudson web service with 90% code coverage checks, all the standard endpoints and MBeans, plus an example endpoint that integrates with a database via Hibernate.

We continually contribute features back to the archetype to ensure that new services will keep pace with any new technology decision.

At the moment our Maven Archetype makes use of Apache Velocity in order to apply naming decisions, such as Java packages and the web-service name, to the generated code base. We’re keeping a close eye on Spring Roo in order to understand how our Archetype can continue to evolve and improve.
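To illustrate what “applying naming decisions” means, here is a minimal sketch in plain Java of placeholder substitution. It is a stand-in for the idea and not Velocity itself: it handles only simple ${name} references, and the class name, template text, and values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for template rendering; a real archetype relies on Velocity.
class TemplateNamingSketch {

    // Matches ${name}-style references only; no directives, no nested properties.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z][A-Za-z0-9_-]*)\\}");

    /** Replace every ${name} in the template with its value, failing on unknown names. */
    public static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = values.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed naming decisions for a made-up service.
        Map<String, String> values = new LinkedHashMap<>();
        values.put("package", "com.example.catalog");
        values.put("serviceName", "Catalog");

        String template = "package ${package};\n\n"
                + "public class ${serviceName}Endpoint {\n"
                + "}\n";

        System.out.print(render(template, values));
    }
}
```

Failing fast on an unknown placeholder is a design choice in this sketch, so that a missing naming decision surfaces at generation time instead of as a stray ${...} in the generated code.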

Having homogeneous projects also helps further our goal of enterprise Collective Code Ownership.  It’s very easy to check out and build any service and be instantly familiar with how it’s engineered.
