Help, I’m clueless about Web Service scalability

Robert Scoble Uncategorized October 5, 2008 1 Minute

I’m really freaked out. I have one of the biggest interviews of my life coming up and I’m way underqualified to host it.

It’s on Thursday and it’s about Scalability and Performance of Web Services.

Look at who will be on. Matt Mullenweg, founder of Automattic, the company behind WordPress (and behind this blog). Paul Buchheit, one of the founders of FriendFeed and the creator of Gmail (he’s also the guy who gave Google the “don’t be evil” admonishion). Nat Brown, CTO of iLike, which got six million users on Facebook in about 10 days.

All three have faced huge scalability problems head on. All three are developers and architects who have actually built systems that have earned great reputations online.

I’m totally out of my league and as I do more and more research on the topic I realize just how out of my league I am.

But, one good thing about doing stuff online is that:

1. I can admit I’m over my head and get help from you.
2. I just need to know enough to be dangerous to get a conversation going between these three guys.
3. I’m not the only interviewer here. You will take over and fill in the pothole in my own knowledge (we’ll get you involved via Skype).

It’s free. It’s open to you.

So, since I’m clueless about the topic, what would you ask these guys about how to build scalable and performant Web services, especially given that tomorrow’s services are probably going to be glued together from a variety of services?

Oh, and thanks to Rackspace for sponsoring this webinar (we’re filming it at the excellent Revision 3 studios in San Francisco).

104 thoughts on “Help, I’m clueless about Web Service scalability”

Palaniappan C says:

October 6, 2008 at 6:53 am

At which point in the process of building your app do you think about scalability? For what kind of apps is it important that scalability is taken into consideration from the get go, when you’re making your data model? And for what kind of apps is it not all that critical?

Also, how difficult is it to alter your data model and architecture without screwing up when you are live? Does everyone have to get an expensive DBA? Or is it mostly common sense?

Eric Lorentz says:

October 6, 2008 at 6:54 am

I would ask them how they develop their database functions.

I usually develop in stages, with the first stage being very bloated with repetitive functions and calls. After I get all of my functionality in order I work through everything and consolidate functions to make them as dynamic as I possibly can. I’ve tried to plan things on paper ahead of time, but it never works out.

Do they have a development process that addresses this from the beginning, or do they go through the same stages?

Eric Florenzano says:

October 6, 2008 at 6:57 am

It seems like the pain point of everybody’s scalability right now lies in the database layer. Did you use traditional methods of scaling the database (clustering, replication, etc.) or have you turned to alternatives like CouchDB and HBase, and what have been the benefits/drawbacks of your decisions in that area?
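
For readers unfamiliar with the "traditional" replication route mentioned here, a minimal sketch of read/write splitting might look like the following. Everything in it is an assumption made for illustration: `ReplicatedStore` is a hypothetical name, plain dicts stand in for database servers, and replication is simulated as an immediate copy, which real systems generally do not guarantee.

```python
import itertools


class ReplicatedStore:
    """Toy primary/replica router. Dicts stand in for database servers (an assumption)."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin iterator over replica indexes, used to spread read load.
        self._next_replica = itertools.cycle(range(replica_count))
        self.reads_served = [0] * replica_count

    def write(self, key, value):
        # All writes go to the primary first...
        self.primary[key] = value
        # ...and are then copied to every replica. Real replication is usually
        # asynchronous, so replicas can lag; this sketch copies immediately.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads never touch the primary, so adding replicas adds read capacity.
        index = next(self._next_replica)
        self.reads_served[index] += 1
        return self.replicas[index].get(key)


store = ReplicatedStore(replica_count=2)
store.write("user:1", "Ada")
values = [store.read("user:1") for _ in range(4)]
```

The sketch only helps read-heavy workloads, since every write still lands on the single primary. That limit is one reason people look at partitioning or at the alternative stores named in the question.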

Palaniappan C says:

October 6, 2008 at 6:58 am

Also, how effective do you think Amazon’s web services are at being a worry-free solution for this problem? What kind of apps make sense on this kind of database?

Thanks!

ecokind says:

October 6, 2008 at 7:21 am

Holy smokes! That’s a bigtime interview! Looks like you have some good answers generating.

Have a great time!
Amy Woidtke
green interior decorator
Seattle, WA

mike ashworth says:

October 6, 2008 at 7:28 am

Hi Robert,

I’ve picked up on one of the words you just used “glue”. Perhaps ask them how they can build a credible, trusted Brand when they glue bits of applications together to create something for the end-user.

How, as a Brand, will they cope when these services fail? How will they manage that Brand Experience? Sure, a lot of these products are free to the end user, yet that doesn’t mean that they should receive a shoddy or poor or failing service.

If they choose free services in the cloud from places like Amazon and Google with no real uptime guarantees, how can they manage that?

Lots of people love Twitter, yet people now have to change their ways to use it well: de-follow, use FriendFeed. Twitter turned off SMS in the UK as well. All these are actually people problems and not technical problems. They impact upon the way people go about their lives.

I’d also be interested in how they plan (or if they can at all) for people using the product in ways they never imagine that impact upon their initial scope? That’s kinda what happened to Twitter as well.

Mike Ashworth
Marketing Coach and Consultant
Brighton and Hove, Sussex, UK
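
On the question of coping when a glued-in service fails, one commonly used technique is a circuit breaker with a fallback: after a few consecutive failures the app stops calling the broken service for a while and shows a degraded but working result instead. The sketch below is only an illustration under stated assumptions. `CircuitBreaker`, `flaky_recommendations`, and `no_recommendations` are hypothetical names, and the clock is injectable so the behaviour can be shown without waiting.

```python
import time


class CircuitBreaker:
    """Wrap a remote call; after repeated failures, serve the fallback for a cool-down period."""

    def __init__(self, call, fallback, max_failures=3, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self._call = call
        self._fallback = fallback
        self._max_failures = max_failures
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._open_until = float("-inf")  # circuit starts closed

    def __call__(self, *args):
        now = self._clock()
        if now < self._open_until:
            # Circuit is open: skip the remote call entirely and degrade gracefully.
            return self._fallback(*args)
        try:
            result = self._call(*args)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                # Too many consecutive failures: stop calling until the cool-down ends.
                self._open_until = now + self._cooldown
                self._failures = 0
            return self._fallback(*args)
        self._failures = 0
        return result


attempts = []
now = [0.0]  # fake clock value, in seconds


def flaky_recommendations(user_id):
    # Hypothetical third-party call that is currently down.
    attempts.append(user_id)
    raise ConnectionError("third-party service is down")


def no_recommendations(user_id):
    # Degraded result: the page still renders, just without recommendations.
    return []


recommend = CircuitBreaker(flaky_recommendations, no_recommendations,
                           max_failures=2, cooldown_seconds=10.0,
                           clock=lambda: now[0])
results = [recommend("u1") for _ in range(5)]
```

Here only the first two of the five requests actually reach the failing service. The rest are answered from the fallback until the cool-down passes, so users see a page with a missing section rather than an error.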

Robert Scoble says:

October 6, 2008 at 7:32 am

There are more comments on this post over on FriendFeed, including some good suggestions: http://friendfeed.com/e/ad1a6d75-56de-1139-792e-2d6fa006d839/Help-I-m-clueless-about-Web-Service/

David Sifry says:

October 6, 2008 at 7:35 am

Robert,

Come up and visit us in San Francisco this week – having gone through a LOT of scalability issues with the growth of Technorati, I can give you lots of grounding on the basics, so you won’t feel uncomfortable with these very smart guys. There’s a set of basic rules and principles that help out in building scalable systems – but it’d take an entire book to really talk about them all in detail. Come on up and visit – and if you want, I’ll give you a sneak peek into the depths of Offbeat Guides, and how we’re building to scale as well… 🙂

You’ve got my number/email, drop me a line. It’d be great to catch up as well!!!

Dave

epiquest says:

October 6, 2008 at 7:43 am

Schoobie!……I can’t believe that there’s ‘any’ on-line subject you’re not well versed in, (but I love the ‘magnetic’ tag).

This is a work of art and you must be congratulated. It’s no wonder you’re everywhere we turn.

Good luck with your interview, I’m sure it’ll be a great success.

Pete.

Eric says:

October 6, 2008 at 7:53 am

I’d like to know: what metrics do you consider key to determining whether your system is scaling well? Do you run a consistent set of requests through and measure latency over time? Do you look at aggregate loads or throughput?

How do you think about ROI for time and energy invested in making your architecture more scalable?
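
To make the latency and throughput metrics asked about above concrete, here is a small sketch of how one measurement window could be summarized. It assumes latencies are already being collected per request; the helper names and the sample numbers are made up for illustration and do not come from any of the panelists' systems.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (0 < pct <= 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms, window_seconds):
    """Summarize one measurement window of request latencies."""
    return {
        "requests": len(latencies_ms),
        "throughput_rps": len(latencies_ms) / window_seconds,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "max_ms": max(latencies_ms),
    }


# Made-up sample: ten requests observed in a 2-second window, one of them slow.
sample = [12, 15, 11, 14, 13, 250, 16, 12, 18, 14]
report = summarize(sample, window_seconds=2)
```

Running the same fixed set of requests regularly and comparing these summaries over time is one way to see whether latency creeps up as load grows. Percentiles are used rather than an average because a single slow request, like the 250 ms one in the sample, can distort an average while the typical request is still fast.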

danielmcvicar says:

October 6, 2008 at 8:15 am

Hi Robert
I would use your lack of knowledge about scalability as an asset…you are a curious person, and can help someone like me who wants to learn about scalability through you.

It is sometimes better not to know too much so that you can find out.

So find out for me!
D

Simon says:

October 6, 2008 at 8:28 am

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” Knuth, Donald

Discuss.

Do you build an application and make it scalable, or do you build a scalable application from the beginning?

Is scalability more important than functionality sometimes?

Simon.

Jon Hancock says:

October 6, 2008 at 8:37 am

Scalability is another word for saying “how do you engineer something well”. If you’re a software engineer with enough experience, the answer is that making something scalable has no silver bullet. You just “write good software”. So what does that mean?

In part, it means encapsulating your design and code at a mid to low level of granularity. If you are strict with encapsulation techniques (following pure OO and message passing design is a solid method) it doesn’t matter what languages or libraries or databases you use in each part, as you simply rewrite/replace components as they become bottlenecks.

This approach allows you to get a full system bootstrapped without need to optimize components ahead of time.

The whole point of a web service is that you have a simple API and others don’t care about how you implement internals. So taking an API-centric design approach is key.
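
A rough sketch of what that encapsulation could look like in practice follows. It is an illustration, not the commenter's code: `ProfileStore`, `InMemoryStore`, `ShardedStore`, and `greet` are hypothetical names. The point is that the calling code depends only on a small API, so the simple first implementation can be swapped for a partitioned one when it becomes the bottleneck.

```python
import zlib
from typing import Optional, Protocol


class ProfileStore(Protocol):
    """The only contract callers see; the implementation behind it can be replaced."""

    def get(self, user_id: str) -> Optional[str]: ...

    def put(self, user_id: str, name: str) -> None: ...


class InMemoryStore:
    """First, simple implementation: good enough to bootstrap the system."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, user_id: str) -> Optional[str]:
        return self._data.get(user_id)

    def put(self, user_id: str, name: str) -> None:
        self._data[user_id] = name


class ShardedStore:
    """Later replacement: same API, keys spread across several backends."""

    def __init__(self, shards: list) -> None:
        self._shards = shards

    def _shard_for(self, user_id: str):
        # crc32 is stable across processes, unlike Python's built-in hash() for str.
        index = zlib.crc32(user_id.encode("utf-8")) % len(self._shards)
        return self._shards[index]

    def get(self, user_id: str) -> Optional[str]:
        return self._shard_for(user_id).get(user_id)

    def put(self, user_id: str, name: str) -> None:
        self._shard_for(user_id).put(user_id, name)


def greet(store: ProfileStore, user_id: str) -> str:
    """Caller code depends only on the ProfileStore API, not on how it is implemented."""
    name = store.get(user_id)
    return f"Hello, {name}!" if name is not None else "Hello, stranger!"


simple = InMemoryStore()
sharded = ShardedStore([InMemoryStore(), InMemoryStore(), InMemoryStore()])
for store in (simple, sharded):
    store.put("u42", "Ada")
```

`greet` behaves the same with either store, which is the property that lets a component be rewritten without touching its callers.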

Bob Dole says:

October 6, 2008 at 9:34 am

I’d take a 3 part approach:

1) Start with something very restrictive. App Engine is somewhat close to where you should start. You get .2 seconds to handle each request, you have no state on the server, you can’t do joins or complex queries on the database, and you can’t persist anything outside of the database. At no point should you assume that any two requests will hit the same front end server, and at no point should you assume that any two database entries will be on the same DB server.

2) Ignore everything in step 1 *when necessary for your application*. This important request needs to do a lot of heavy lifting and a few joins? Fine. You need to do a database query that takes 2 seconds on an empty database for a feature of marginal use? No. Make your application unscalable if it is necessary, but only as a conscious choice.

3) Now you’ve hit the big time, and your system is melting. Add front end servers and throw session info into the database. Federate and replicate the heck out of the database, and do everything you can to get rid of the expensive queries you added in step 2. Generally, this will involve de-normalizing and caching as much as possible. Add caching at every point in your app. You should be able to hire people to help at this point.
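
As a sketch of the "cache as much as possible" step, here is a minimal cache-aside helper with expiry. It is an assumption-laden illustration: `CacheAside` and `slow_profile_query` are hypothetical, and the in-process dict is only a stand-in. Since step 1 says no two requests can be assumed to hit the same front end server, a real deployment would more likely keep the cache in a shared service such as memcached.

```python
import time


class CacheAside:
    """Check the cache first, fall back to the slow query, remember the result for a while."""

    def __init__(self, load, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load        # the expensive query (hypothetical callable)
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be shown without sleeping
        self._entries = {}       # key -> (expires_at, value)
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # fresh cached value: no database work at all
        # Miss or expired entry: run the expensive query and cache its result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this after a write so readers do not keep seeing stale data.
        self._entries.pop(key, None)


fake_now = [0.0]  # fake clock value, in seconds
calls = []


def slow_profile_query(user_id):
    # Stand-in for one of the expensive queries added in step 2.
    calls.append(user_id)
    return {"id": user_id, "friend_count": 3}


cache = CacheAside(slow_profile_query, ttl_seconds=30.0, clock=lambda: fake_now[0])
first = cache.get("u1")    # miss: runs the query
second = cache.get("u1")   # hit: served from the cache
fake_now[0] = 31.0
third = cache.get("u1")    # entry expired: runs the query again
```

The trade-off is staleness. A longer expiry removes more load from the database but lets readers see older data, which is why the helper also has an explicit `invalidate`.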

Tuomas Tanner says:

October 6, 2008 at 3:16 am

Do you think the new more “sexy” interpreted languages such as Python, Ruby or PHP are more difficult to scale than the arguably faster compiled languages such as .NET and Java?

Are there some features in a language / framework that make it better suited for scaling?

Which language / platform would be ideal for building a robust easily scalable web application?

Pingback: BuzzGain » 5 simple tips of hosting a great blogger event
Fredrik Wennberg says:

October 6, 2008 at 10:17 am

I would like to know;

* What technical infrastructure scalability models did they try (horizontal, vertical, or perhaps a combination)?
* Scalability and resilience often go hand in hand… was resilience considered to be a known factor or a bonus when they decided to scale up or scale out?
* Have they encountered any geographical scalability problems?
* On what OSI-level have they concentrated the most, when trying to solve scalability issues?
* How are they proactively monitoring performance problems and on what OSI-levels are they monitoring?
* From a scalability perspective “the cloud” can look like a good idea, but what are their thoughts on resilience in “the cloud”?

Fredrik Wennberg
IT Solution Architect

Robert says:

October 6, 2008 at 12:30 pm

There are a number of tools out there for monitoring the efficiency and effectiveness of the web services layer. Which, if any, do they use and why?

Amberpoint is the one I’m most familiar with. (www.amberpoint.com). I’d love to know if there are others that are different/better.

Hillel Glazer says:

October 6, 2008 at 1:03 pm

I’d definitely include a question of how the issue of scalability “showed up” for each of them. It is likely different.
– What were the symptoms?
– What part of the architecture or system was the most limiting factor?
– What did they do about it that wasn’t just volume/capacity adjustments with more boxes?
– What was the most innovative thing that was created to deal with scale? Did they make any new/interesting algorithms, techniques, methods, etc.?
– They’re probably not done dealing with scalability, what are their ongoing efforts to improve?
– Was scalability always/only about technology? What about processes and people scalability?

Frank Mashraqi says:

October 6, 2008 at 1:37 pm

Hi Scoble,

I regularly speak at conferences about scalability including 4 sessions this year at MySQL Conference. In the past I helped scale Fotolog (then the 13th largest website on the Internet) to achieve a 10x growth without adding any database server. Most recently I presented a session at Dave McClure’s Startonomics about Startup Scalability Strategies: How to grow up without blowing up. I regularly help candidates preparing for their interviews. You can reach me at 5 5 1 6 5 5 5 5 9 0 and within 30 minutes I can help you cover major ground in terms of strategies, approaches and tools to become scalable.

Regarding at what point a Startup should consider scalability:
http://startonomics.com/blog/startup-scalability-strategies-how-important-is-being-scalable/
http://mashraqi.com/2008/09/startonomics-startup-scalability.html
http://startonomics.com/blog/scalability-for-start-ups-how-to-grow-up-without-blowing-up/

Talk to you soon,

thanks,
Frank

Glen says:

October 6, 2008 at 1:37 pm

High Scalability is a good website to keep up with. They recently had a post on the “7 stages of scaling web apps.” The problem is that so many of these smaller websites simply keep reinventing the wheel; very few of the problems they face are new, and there are known ways of scaling sites, especially at places like Yahoo! and Google that have been doing it for years.

I wrote a blog post on this issue last month at http://blog.broadpool.com/2008/09/23/it-goes-to-11/ . I guess the biggest questions for most listeners would be, “How do I get there? I can’t start from where Yahoo! starts, so how do I build a site so that it can grow over time?” The other question would be, “How do I handle success? What do I need to do to ensure that, should my web app be wildly successful, we don’t die because of it?”

penguinsix says:

October 6, 2008 at 2:11 pm

I think you should ask whether a startup should plan to scale from the beginning, or concentrate instead on building growth and dealing with scalability later.

You have to bring up the Twitter scenario–ask them their thoughts on why Twitter failed (db scalability).

You should also ask about ‘the cloud’ and the future of scalability. Is what was once a major purchase and commitment (new servers, configing and coordinating) soon to be replaced by ‘cycles on the cloud’?

Ask them about scaling on LAMP vs. scaling on Windows. 🙂

Steve Borsch says:

October 6, 2008 at 2:18 pm

Thanks for reaching out on this interview prep. I can’t think of anything more important to the future of cloud computing, user experience and lower blood pressure globally than scalability!

Ironic that tomorrow is the 3 year anniversary of a post I did entitled, “Web 2.0 Conference: The Dirty Little Secret”: http://www.iconnectdots.com/ctd/2005/10/web_20_conferen.html where I talked about the complete and utter lack of ANY discussion of scalability or latency. It was all about “just build it” which made me shudder.

Here are some key questions to ask:

1) There are two audiences for scalability, developers and users, but they have one shared goal, performance. Are there any best practices or benchmarks for how fast a web app or page SHOULD parse? Is there a min-max window of performance developers should target?

– Developers want application performance but balanced with a need to optimize conflicting priorities (e.g., delivering fast web apps but needing to wait for an ad server to deliver a personalized advertisement).

– Users don’t care about the monetization demands…they just want the app to work and be nearly as good an experience as a desktop app (though willing to make a trade-off for having stuff in the cloud accessible from anywhere with different device types)

2) We all know there are accelerating internet loads from an ever-increasing number of broadband users, apps in the cloud, and data types like video. We’ve seen wildly conflicting estimates of internet capacity as well. Are there *any* definitive internet infrastructure numbers that’ll help developers, I.T. professionals or anyone creating ‘net-centric strategies, to get a clue about latency, capacity and so forth going forward?

(Need to tell you that this is THE #1 biggest issue with all the startups we cover at Minnov8.com. Without a PhD in network topology and infrastructure, how the hell is a handful of geeks to build and deliver a strategically sound platform or application-set?).

3) “To API or not API: That is the question”. One could argue that the root cause of Twitter being the poster whale for fail was the API. It’s almost comical how many have leapt on the API and are using it for all sorts of apps. So the question might not be “API or not”, but “when to API”?

Good luck.

—
Steve

Robert Accettura says:

October 6, 2008 at 2:19 pm

Following the Pareto principle (the 80/20 rule), what in your opinion is the hit list of tasks to scale a LAMP application on the web? When do you feel the deadline is in terms of development and launch?

Robert Sanchez says:

October 6, 2008 at 3:09 pm

Hi Robert,

Sounds like an amazingly interesting webinar. I guess the question I would ask is, how exactly do you test your efficiency and scalability beforehand, so that you can be modestly prepared for that overnight 6 million user count?

Also, will this be available after the conference? I registered on Fast Company, but I will have to be working while the conference is live. Thank you, Scobleizer.

Girish says:

October 6, 2008 at 3:37 pm

For someone that is serious about rejecting job applications if they have spelling errors, you sure don’t care as much about your own blog concerning the same issue now, do ya? 😉

Admonishion? You mean admonition?

Tim Shisler says:

October 6, 2008 at 4:06 pm

How can you complain about folks not being prepared to send in a job resume and turn around the next day and say you can go into something underprepared and be okay about it?

I know there is a stark difference, but really, come on. If you’re allowed to ask for help, why not give folks the benefit of the doubt if they flub a resume and ask them why?

Molly Scofield says:

October 6, 2008 at 4:22 pm

Great points raised by all. I’d also love to know, in the face of rapid scale, what success looks like to these guys.

Fast and big can be great (and usually commands attention) but may or may not be a good indicator of success. Likewise, how do they measure success and then revise/redeploy on the fly in response to the data?

Matt says:

October 6, 2008 at 9:41 am

Lots of the questions posed above are rather direct. If you’re looking to get a conversation going amongst them, I’d ask questions which encourage some “give-and-take” rather than give direction to their answers. You want to hear what these very smart dudes think is most important/relevant about the topic. Being modest, I’d start by assuming they would ask better questions of each other than I could ask of them – after all, we are interviewing them because of their successful expertise.

Questions:

“Individually looking at each other’s past projects like iLike and FriendFeed, what items in their development do you think were key to the projects’ success? In a similar vein, what would you have done differently and how would it have made things better?”

“In hindsight, were there key players beneath the surface in these projects which played a large part in making them a success, or was the project so well defined that everyone came together equally in making it a success? If there were key people, what did they provide to the project that you couldn’t provide personally?”

Freeman says:

October 6, 2008 at 5:11 pm

First off, you need to have Katie Couric interview you on the subject. Do whatever you can to look as clueless as possible during this interview. This will sharply lower everybody’s expectations of your performance during the real interview.

Then it should be easy to show remarkable improvement when the real interview comes around. Everyone will be commenting about how you aced it, regardless of the fact that you will still be the least-informed individual in the room on the subject.

It’s all about perception management. Instead of comparing you to the more-informed subjects of your interview, most people will compare you to your even-less-informed previous performance.

I’m not going to tell you where I got this idea. Just suffice it to say that political strategists are genius!

svetainiu kurimas says:

October 6, 2008 at 5:33 pm

Thanks for post like this!!!

Robert Scoble says:

October 6, 2008 at 6:11 pm

Tim: >>How can you complain about folks not being prepared to send in a job resume and turn around the next day and say you can go into something underprepared and be okay about it?

Who is going in unprepared? Seems this blog post just prepared me in a BIG way for Thursday!

Tim Shisler says:

October 6, 2008 at 7:20 pm

Very true Robert. It was more of a gut reaction that I should’ve thought more about. Thanks for replying.

maa says:

October 6, 2008 at 12:24 pm

I have had my share of building scalable systems in the past — and the one thing that always came back to me was the question: is the chosen application architecture adequate to support the needs?

There are some basic topologies that everybody uses:

a) “the sink” uses the database as repository of every message — write it first, let it be read second; parallelize/cluster the db server and you easily crank up the volume. This model is very popular amongst web 2.0 systems. It is limited though, as adding a db server only brings 0.75 more power.

b) “the network” uses pipes to distribute messages between writers and readers — very popular in the telco industry, where speed is master and geography plays an important role. Parallelize writers and readers and you get a linear scale that depends alone on the number of servers you put in. Big disadvantage is that distributed persistence needs to be consolidated at some point — can generate some difficult to tame data flows.

Now, ask yourself: which model was chosen by Google? And which by Twitter?
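
To make the comparison between the two topologies concrete, here is a rough sketch of the scaling curves. It reads "0.75 more power" as each additional database server contributing 0.75 of one server's throughput, which is only one possible interpretation of that figure. The function names and the 1,000 messages per second per server are invented for illustration.

```python
def sink_capacity(servers: int, per_server: float = 1000.0, gain: float = 0.75) -> float:
    """Throughput of the "sink" model under the assumed reading:
    the first database server counts fully, each extra one adds `gain` of a server."""
    if servers < 1:
        return 0.0
    return per_server * (1 + gain * (servers - 1))


def network_capacity(servers: int, per_server: float = 1000.0) -> float:
    """Throughput of the "network" model: linear in the number of servers."""
    if servers < 1:
        return 0.0
    return per_server * servers


if __name__ == "__main__":
    # Print both curves side by side to show how the gap widens.
    for n in (1, 2, 4, 8, 16):
        print(f"{n:>2} servers: sink={sink_capacity(n):>8.0f}  network={network_capacity(n):>8.0f}")
```

Under these assumptions the gap between the two models grows with every server added, which is the trade-off the two descriptions point at. The sketch ignores the consolidation cost that the "network" model pays for distributed persistence.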

Alistair says:

October 6, 2008 at 12:48 pm

Robert,

Services are fascinating, for a couple of reasons. Sure, there are technical “service scaling” issues, which the other folks on the panel will know all about. Matt has great stories on the scaling of Akismet. But far more interesting are the human scaling issues. I always find more thoughtful discussions there.

First, there are terms-of-use issues. When you get to a certain size, you need to have policies for appropriate use. You wind up creating competitors, and an ecosystem emerges around your service. Look at Twitter, and the emergence of complementary products like Summize. Then look at what happened when Notchup exploited LinkedIn to grab a bunch of users. How you monitor use and enforce terms of use is a big question, and it goes far beyond simple APIs and scaling.

Second, there’s the fact that we’re building human APIs. APIs and web services are typically focused on letting machines talk to one another. But by tying realtime activity feeds to our mobile devices, or location-based services that report our coordinates, we’re plugging humans into applications, Amazon Turk style. As humans start to interact with applications via web services, through mobile devices and so on, a whole new set of scaling issues emerge.

Since you’re the higher-level, human-angle participant on the panel, I’d elevate things beyond bits and bytes and into humans, policies, exploitation, and startup ecosystems.

Not sure that helps… sounds like it’ll be a great panel!

A.

Geoffrey Wiseman says:

October 6, 2008 at 9:19 pm

There’s a few areas I’d want to cover:
– Where does scalability cross paths with standards (WS-*, WSDL, UDDI) vs. simplicity (REST, POX+HTTP).
– Do these people think that ESB has a place in their view of scalable services?
– How should you look at databases differently in a service-oriented model? (There’s all sorts of sub-topics here; do services share a database, or have their own, does it vary? 2PC vs. other kinds of synchronization? Clustered caches vs. RDBMS?)
– What mechanisms are important to keep instances in synch and sharing work without tripping over each other or creating new bottlenecks (messaging, database, clustered caches, etc.)?
– Are location-transparency/routing and directory/discovery services important to scalable SOA, or is this simply the job for DNS and load balancers?

There’s tons of interesting things to talk about here, really.

Stephen Pierzchala says:

October 6, 2008 at 9:31 pm

Everyone has the internal component covered. I come at it from the outside.

How do you measure and monitor the external performance of your Web API to proactively deal with loading, connectivity, and application issues?

What is the issue you run up against most: bandwidth bottlenecks or application loads?

smp

Matt M. says:

October 7, 2008 at 1:02 am

?: It seems like a lot of problems with scalability start with the current Database Server architecture. Should a new startup looking to scale start with an alternate? Are there production ready open source or commercially supported alternatives? What would it take for an alternate to be considered “safe” like a SQL Database Server?

paul says:

October 7, 2008 at 10:31 am

How do you provide guaranteed failover when ultimately your service has one IP address, cached inside the world’s network of DNS servers? Anyone know, by the way?

Ryan says:

October 7, 2008 at 2:05 pm

@paul: One easy thing you can do is high-availability failover where a separate server takes over the IP address (and other characteristics) of your failed server. Check out http://www.linux-ha.org.
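
A minimal sketch of the takeover decision behind that kind of failover, assuming the standby server listens for heartbeats from the primary and claims the shared IP address once they stop. This is not Linux-HA's configuration or API. The class name, the timeout, and the injectable clock are hypothetical, and the actual reassignment of the address is left as a comment.

```python
import time


class FailoverMonitor:
    """Runs on the standby server and decides when to take over the shared IP."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout            # seconds of silence tolerated from the primary
        self.clock = clock                # injectable so the logic can be tested
        self.last_heartbeat = clock()
        self.owns_ip = False

    def heartbeat(self) -> None:
        """Call whenever a heartbeat from the primary arrives."""
        self.last_heartbeat = self.clock()

    def check(self) -> bool:
        """Return True once this standby has taken over the address."""
        silent_for = self.clock() - self.last_heartbeat
        if not self.owns_ip and silent_for > self.timeout:
            # A real deployment would reassign the shared IP address here.
            self.owns_ip = True
        return self.owns_ip
```

Because clients keep using the same address, nothing cached in DNS has to change when the standby takes over.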

Pete Austin says:

October 7, 2008 at 4:31 pm

You need to decide how scalable you want it to be, because the architecture gets trickier with each power of 10 and the bugs subtler. You can give up on getting any sensible advice on the InterWeb for a start. With massive systems, the principle is to design your service as a cluster of sites which are as independent as possible in normal use, but which provide disaster-recovery for each other. The bottlenecks are in things like assigning a user to a site and passing messages between sites.
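
One way to picture the "assigning a user to a site" step is a hash of the user ID that picks a home site, with a fallback to the next site when the home site is down. This is only a sketch under those assumptions. The site names and function names are hypothetical, and a production system would more likely use a lookup directory or consistent hashing so that adding a site does not move most users.

```python
import hashlib

SITES = ["site-a", "site-b", "site-c"]  # hypothetical independent sites


def home_site(user_id: str, sites=SITES) -> str:
    """Pick a stable home site for a user from a hash of the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return sites[int.from_bytes(digest[:8], "big") % len(sites)]


def site_for(user_id: str, healthy: set, sites=SITES) -> str:
    """Return the home site if it is healthy, otherwise the next healthy site
    in list order, so sites act as disaster recovery for each other."""
    start = sites.index(home_site(user_id, sites))
    for offset in range(len(sites)):
        candidate = sites[(start + offset) % len(sites)]
        if candidate in healthy:
            return candidate
    raise RuntimeError("no healthy site available")
```

In normal use every request for a given user lands on the same site, so sites stay independent. Only during an outage does traffic cross over to a neighbor.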

jonas says:

October 7, 2008 at 5:26 pm

follow this blog if you want to know more 🙂

http://highscalability.com/

jonas says:

October 7, 2008 at 5:27 pm

( http://highscalability.com/start-here )

Niall Kennedy says:

October 7, 2008 at 6:06 pm

What are your current bottlenecks? If you could install a new piece of software or rack a new piece of hardware tomorrow to solve key pain points, what would it do?

Ted Murphy says:

October 7, 2008 at 6:39 pm

Ask them for one scary example in their past where they screwed up and the app hit the wall. And ask them how they fixed it.

sidereal says:

October 7, 2008 at 7:03 pm

1) At what point in the lifecycle of your product do you start to focus on high scalability? How willing should you be to compromise performance or development time to reach it before it’s strictly necessary?

2) Is true horizontal scalability ever achievable or will there always be some shared resources?

3) Is designing for scalability a reactive or proactive process?

4) Are we too eager to abandon the benefits of normalization to achieve scalability?

5) Traffic patterns, especially for smaller sites, can be extremely volatile and unpredictable. How do you design for that?

6) What’s the time horizon on scalability becoming a commodity that’s provided by hosting providers alongside power and ethernet?

matt m says:

October 7, 2008 at 7:46 pm

How much consideration was given to scalability issues in your initial design?

What decisions, from your initial design, presented the largest hurdles to scaling?

What issues will you ensure are taken into consideration in your next version 0 design?

Kent Langley says:

October 7, 2008 at 7:46 pm

My Question:

How have you changed the way you develop software to improve scalability?

robw says:

October 7, 2008 at 7:53 pm

A good conversation starter would be to ask them about which dimensions they found to be the most challenging to scale along.

Software/system architecture is only one aspect of successfully building a scalable system. Operations, development/deployment processes, monitoring and so on all make a big difference to the scalability of any system.

frederic sidler says:

October 8, 2008 at 9:35 pm

Hosting
Which provider would they choose today if they needed to build a new service that could face a scalability problem?

Limitation/Marketing
Is an invitation procedure a good way to manage the number of users accessing the platform? Would it frustrate the people not authorized to access the system, or would it force them to find alternative ways to beta test it? 😉
