Today, I attended the first day of the QCon New York 2014 conference. Here is a brief introduction to the conference:
Software is Changing the World. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.
QCon starts with 2 days of tutorials on Monday and Tuesday, June 9-10 followed by the full 3-day conference from Wednesday, June 11-13. The conference will feature over 100 speakers in 6 concurrent tracks daily covering the most timely and innovative topics driving the evolution of enterprise software development today. The setting is the beautiful, centrally-located Marriott at Brooklyn Bridge in New York City.
The main reasons I attended this conference were to learn about industry trends in software and service development, to learn how more people are using big data and machine learning techniques in their services, and to find existing Microsoft customers and understand their use cases.
Here are the highlights of day 1 of the conference. On day one there were 6 tracks running concurrently, including:
Hot Technologies behind Modern Finance
The Hyperinteractive Client
Lean Product Design
Architectures you've always wondered about
I was mostly involved in the Continuous Delivery and "Architectures You've Always Wondered About" tracks. I will share the slides and talks (if possible) in the coming days.
Application monitoring and NoSQL/other database technologies dominated the booths; nearly every booth related to one of these areas, which is a good indicator of what is hot in the industry.
I knew AppDynamics from the popular slide deck "Call of Duty: DevOps". AppDynamics is a very popular company for monitoring customers' applications. Their application performance management (APM) product lets people monitor hybrid environments with Java, .NET, and PHP (and a long list of other environments). The people at the booth showed me a demo of how they monitor an Oracle database and troubleshoot performance using wait stats, poor query plans, etc. It reminded me that our SQL Azure performance monitoring is not as great as it should be.
Riak: a key-value store that Yammer uses. They are talking with Azure about hosting on Azure, and are moving to Seattle soon.
GridGain: in-memory computing (HPC), in-memory streaming, an in-memory accelerator for Hadoop, and an in-memory database.
Tibco ActiveSpaces: a distributed peer-to-peer in-memory data grid.
Today's keynote was "Whither Web Programming?" from Gilad Bracha, co-author of the Java Language Specification. Gilad discussed several tools that can interactively compile code and show the result in the browser. Since I am not a frontend expert, I will just list my notes here:
You can modify code and syntax online. He mentioned reflection: reflection makes your output larger.
Has a live debugger, (IDE),
serializing a thread,
continuation to flow…
Live code editing + Mixins
Changing a mixin at runtime means changing all classes that mix it in.
His final thought:
Whither web programming:
Foursquare: evolving from check-in to a recommendation engine
Currently the company has 140 employees. Highlights:
Part 1: scale the data storage
Started in 2009 with MySQL, they switched to PostgreSQL due to requirements. They used the typical scale-out approach: indexing, memory caching, and when things got big, they either 1) split tables, replacing joins with several queries, or 2) replicated to read-only copies and redirected traffic.
However, they were outgrowing their hardware.
So they decided to use sharding, evaluating the following options:
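The "split tables, replace joins with several queries" step can be sketched in a few lines. This is a toy illustration with hypothetical table names and in-memory dicts standing in for two separate database servers, not Foursquare's actual code:

```python
# Sketch: once `users` and `checkins` live on separate database servers,
# a SQL JOIN is no longer possible, so the application issues two queries
# and joins the rows in memory.  The dicts below stand in for two stores.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}           # store A
checkins = [{"user_id": 1, "venue": "cafe"},
            {"user_id": 1, "venue": "park"},
            {"user_id": 2, "venue": "gym"}]                   # store B

def checkins_with_names(user_ids):
    # Query 1: fetch the matching checkins from store B.
    rows = [c for c in checkins if c["user_id"] in user_ids]
    # Query 2: fetch the referenced users from store A in one batch.
    names = {uid: users[uid]["name"] for uid in {r["user_id"] for r in rows}}
    # Application-side "join" of the two result sets.
    return [{"venue": r["venue"], "name": names[r["user_id"]]} for r in rows]

print(checkins_with_names({1}))  # two queries instead of one JOIN
```

The trade-off is extra round trips and application-side join logic in exchange for letting each table scale on its own hardware.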
They selected Mongo due to
From 2010 to 2011, they migrated one table at a time to Mongo. They now run 15 clusters, handling a peak of 1 million queries per second.
Besides Mongo, they use Memcache, and Elasticsearch for nearby-venue search and user search. They also built two data services:
A read-only key-value server (HFile) is a file index service. They use nightly MapReduce jobs to generate HFiles: prefix index files that pre-compute commonly used query results.
They use ZooKeeper to coordinate the Hadoop cluster that runs these jobs, plus caching services on top of Mongo, since hitting Mongo repeatedly is very expensive.
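A toy sketch of the pre-computed prefix-index idea behind that read-only key-value service. The nightly MapReduce job is reduced here to a plain Python loop, and the parameters and names are assumptions for illustration only:

```python
# Sketch: a batch job pre-computes the top-k results for every short name
# prefix; the online service then answers typeahead queries with a single
# lookup in the resulting (HFile-like) read-only index.
def build_prefix_index(venues, max_prefix=3, top_k=2):
    index = {}
    for name in sorted(venues):
        for n in range(1, min(max_prefix, len(name)) + 1):
            bucket = index.setdefault(name[:n].lower(), [])
            if len(bucket) < top_k:       # keep only the top-k hits per prefix
                bucket.append(name)
    return index                          # written out nightly as an immutable file

index = build_prefix_index(["Blue Bottle", "Bluestone Lane", "Joe's Pizza"])
print(index["blu"])   # pre-computed answer, no database query at request time
```

Because the index is rebuilt from scratch each night and never mutated online, serving it is just file reads plus dictionary lookups.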
Part 2: application complexity
In 2009 the company was using PHP; they then shifted to Scala, using a web framework named Lift.
One interesting tool is called RPC tracing, for API explorer issues. It is the most inexpensive way to get API insight (DB connections, performance, troubleshooting, and stack traces). It tracks RPC counts per API over the past week: if RPC calls increase, it means something is wrong.
Another tool is called Throttles: dynamically switching features on/off. They can turn features on for specific ids, internal users, etc., with different rules. It is used for rolling out new features as well.
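A minimal sketch of how such a throttle might look. The rule names and shapes here are hypothetical, not Foursquare's actual design:

```python
# Sketch of a "throttle": a feature is on or off per request based on rules
# that live in shared config, so they can be flipped without a deploy.
INTERNAL_USERS = {"alice", "bob"}

def feature_enabled(feature, user, throttles):
    rule = throttles.get(feature, {"mode": "off"})
    if rule["mode"] == "on":               # fully rolled out
        return True
    if rule["mode"] == "internal":         # dogfood: internal users only
        return user in INTERNAL_USERS
    if rule["mode"] == "ids":              # allow-list of specific user ids
        return user in rule["ids"]
    return False                           # default: off

throttles = {"new_search": {"mode": "internal"},
             "beta_feed": {"mode": "ids", "ids": {"carol"}}}
print(feature_enabled("new_search", "alice", throttles))  # True (internal)
print(feature_enabled("beta_feed", "dave", throttles))    # False (not listed)
```

The same mechanism doubles as a kill switch: setting a rule back to "off" disables a misbehaving feature instantly.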
Remember the goats, i.e., the growing pains as developers:
So the Solution: SOA infancy
But the team still faced the following problems:
The solutions are:
such as ./service_releaser -j servier_name
Distributed tracing tools
Send all traces to a Kafka queue to summarize the traces
Each application passes a correlation id from the parent down to its children
All aggregations are published to a single place so the full stack for a request can be seen easily
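The correlation-id idea above can be sketched like this. A plain list stands in for the Kafka topic, and all names are assumed for illustration:

```python
# Sketch: every incoming request gets a trace id; each child RPC carries
# the parent's id instead of minting its own, so all spans can later be
# grouped (via the Kafka queue in the talk) into one full stack per request.
import uuid

SPANS = []  # stand-in for the Kafka topic the spans are published to

def record_span(trace_id, service):
    SPANS.append({"trace": trace_id, "service": service})

def call_child(trace_id, service):
    record_span(trace_id, service)         # child reuses the parent's id

def handle_request():
    trace_id = str(uuid.uuid4())           # minted once, at the front door
    record_span(trace_id, "api")
    call_child(trace_id, "search")         # id passed parent -> child
    call_child(trace_id, "venues")
    return trace_id

tid = handle_request()
full_stack = [s["service"] for s in SPANS if s["trace"] == tid]
print(full_stack)  # one request, one complete stack of services
```

Grouping by trace id is what turns a pile of per-service logs into a single end-to-end view of each request.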
Use ZooKeeper + Finagle server sets to dynamically handle hostnames, etc.
Fast-fail RPC calls after some error-rate threshold
Loosely based on Netflix's Hystrix
Smaller teams owning front to back implementation of a features
Desire to have quick deploy cycles on new API end points
Wouldn't it be cool if a developer could expose a new API without redeploying packages?
Some libraries register the endpoints through ZooKeeper (Thrift).
It takes minutes for a dev to get a new API running on the official site, using a proxy to redirect traffic to the new API
Benefit: Tight contract for service interaction
Clear path to breaking off more chunks from the API monolith
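The register-then-proxy flow might look roughly like this. ZooKeeper is replaced by a plain dict and all names are hypothetical; it only shows the shape of the idea:

```python
# Sketch: a service announces a new endpoint at startup, and the
# front-door proxy routes each request by longest matching prefix,
# so new endpoints go live without redeploying the proxy.
REGISTRY = {}                                # stand-in for ZooKeeper znodes

def register(path, handler):
    REGISTRY[path] = handler                 # new endpoint, no proxy redeploy

def proxy(path):
    # Route to the most specific registered prefix.
    for prefix in sorted(REGISTRY, key=len, reverse=True):
        if path.startswith(prefix):
            return REGISTRY[prefix](path)
    return "404"

register("/v2/venues", lambda p: "served by new venues service")
print(proxy("/v2/venues/123"))   # served by new venues service
print(proxy("/v1/other"))        # 404
```

With a real ZooKeeper, ephemeral registrations also disappear when a service dies, so the proxy stops routing to it automatically.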
Future works; part 3:
Migrating to cloud native with micro-services by Adrian Cockcroft
Adrian recently left Netflix to help the IT industry adopt the practices built at Netflix.
Here are the highlights of the learnings from Netflix.
Question: with the rapid pace of change over the last year or 6 months, corporate IT has been learning the cloud and getting up to speed.
I did not record everything, so here are some of the notes I captured for your reference. He talked a lot about microservices, which seem very popular recently. I will share the slides and talks when they become available.
Disruptors: take what used to be expensive and learn to "waste" it to save money somewhere else.
Example 1: solid-state disks. Past: assume random reads are expensive. Now: random reads are free; immutable writes, log-merge.
SSD packaging as disk, as PCI card, as memory storage.
Cloud-native storage architecture (don't just build on SSDs; build a distributed system, with the SSDs embedded into the Hadoop machines).
Linear scale up
Hundreds of nodes per cluster in common use today
Thousands of nodes per cluster are tested.
Example 2: non-cloud product development.
Hardware provisioning is undifferentiated heavy lifting; replace it with IaaS. IaaS-based product development lets you develop in weeks; however, SaaS lets you develop in days.
The difference between big data and BI is answering unplanned questions in hours.
Open Space discussion on Continuous delivery
I attended the open space discussion on CD. There were two topics I was involved in, both testing related. I guess how a testing strategy fits into the overall delivery pipeline of a whole service is still an open question that many people struggle with.
Automatic Performance testing on complex system
In this session, one developer described his problem. His team has a very good testing strategy, focusing a lot on unit testing, some integration testing, and a couple of end-to-end acceptance tests with UI automation. He is worried about performance regressions and wants to see whether performance can be tested in a cheap way. There were a couple of ideas from different people:
I explained the three Ds in the Yammer team: Dark Release, Dogfood, and Data Insight. I suggested turning features off by default to reduce risk, using dogfooding to test features, and building rich telemetry for the system's KPIs.
Automated vs. manual testing
One game company has many manual testers in its QA department to test new releases, which ship every 6 months. One person suggested trying to reduce (or entirely remove) the manual testers and invest more in test automation. The person who raised this question also mentioned that organization is the main issue, since you are working with different departments with different points of view. Again, I suggested they do A/B testing; even though they ship client applications to customers, they might still be able to do A/B testing by turning features on/off, and also do more dogfooding and telemetry.
Google backup cloud, from Raymond Blum
Not a very interesting talk, but a couple of his highlights are very important for us.
TestOps for Continuous Delivery
Acquia Cloud is a PaaS for PHP apps; it has
Obligatory impressive numbers, 03/2014
A release roughly every 1.6 days on average
Our customers hate downtime. The main issue is that server configuration is software, and it is hard to test. Puppet and Chef can assist you, but you still need to invest a lot in testing. Problem: reality is very messy! You might have launch failures or race conditions.
Unit tests vs. system tests
System tests are for end to end
Apply code changes to real, running services
Exercise the infra as the apps will
System tests FTW:
For infrastructure, system tests are essential
Testing in a clone of production is not right:
Backup and restores
Load balancing with up and down workload
ELB health check and recovery
Monitoring and alerts
Management must accept that infrastructure system tests are:
Under-investing will bite you badly.