Grady Booch notes a great post by ckw on Stress-driven development.
I've long been an advocate of reversing the "premature optimization is evil" mantra when it comes to the design of system architecture.
In my recent past life building enterprise systems, my top three pain points were:
  1. Getting enough time with the right people in customer organizations to get reasonable specs.
  2. Deploying complex systems reliably and repeatably.
  3. Being hit by non-functional requirements (especially perf goals) late in the lifecycle.
My old group, along with some of our partners, got pretty darn good at (2). However, (1) and (3) are still bugbears as far as I'm aware.
Now in many ways, (3) is just a special case of (1), and I'm here to tell you that I don't have many good answers to (1). To be honest, I'm not sure the agilists do either.  However, I do think there is a better way to reach agreement on (3) earlier, simply because a supplier has the domain knowledge to put one possible answer on the table.
Typically the only performance goals you can get from your customer without spending a lot of time with a lot of hard-to-get-hold-of people will be of the form "The whole population of Ecuador must be able to transact concurrently with a response time of 1 microsecond" or "Erm, well, we've got 300 users on the current system and it seems quite slow".  If you spend that time without putting anything more concrete and useful on the table, then by the time you have real numbers it might be too late to design your architecture to meet the goals.
I found the best starting point for driving a realistic discussion was to go in with realistic numbers for what is achievable at the expected cost.  What you need to do is build out a representative rig of the system architecture, including the packaged products and middleware involved in the solution, and drive it via a stress rig.  Simulate things like WANs with whatever tools you have available - make worst-case guesses if you need to - or better yet, build out a real link.  Do your best to get the actual hardware you'll deploy onto.  Deploy just enough custom code to reach all the way through the architecture without implementing any real functionality.
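To make the stress rig idea a little more concrete, here is a minimal sketch in Python of the driving side. Everything in it is an assumption for illustration: `thin_slice_request` is a hypothetical stand-in for one call that reaches all the way through your architecture (here it just sleeps), and the user counts are made up. A real rig would more likely use a dedicated load tool, but the shape is the same: many concurrent callers, timed requests, and a summary of throughput and latency percentiles.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def thin_slice_request():
    """Hypothetical stand-in for one call through the whole architecture.

    In a real rig this would enter at the front end and travel through the
    middleware, packaged products and database without doing any real work.
    """
    time.sleep(0.001)  # placeholder for the round trip


def timed_call(request):
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def run_stress(request, concurrent_users, requests_per_user):
    """Drive `request` from a pool of threads and summarize the results."""
    total = concurrent_users * requests_per_user
    wall_start = time.perf_counter()
    # One worker thread per simulated user; the pool keeps them all busy
    # until every request has been issued.
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(timed_call, [request] * total))
    wall = time.perf_counter() - wall_start
    # 99 cut points; "inclusive" keeps every percentile within the observed range.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "requests": total,
        "throughput_per_s": total / wall,
        "p50_s": cuts[49],
        "p95_s": cuts[94],
        "max_s": max(latencies),
    }


if __name__ == "__main__":
    print(run_stress(thin_slice_request, concurrent_users=20, requests_per_user=10))
```

The numbers that come out of something like this, run against the real hardware and middleware, are the ones to take into the conversation with the customer.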
Now what numbers do you have?  Can they get worse than this?  Sure - almost all the custom code you write will degrade your performance from these numbers.  But you have fairly fine-grained control over this code and you can do optimization late in the day.  Can you optimize the hardware, middleware, packages and network infrastructure?  Sure, but it is typically a much coarser-grained process and much harder to do late in the cycle, when hardware and licenses have already been procured.
Talk to your customer about the numbers you're getting with this proposed architecture.  Going in with a line like "With the current plan it can't get better than X" certainly opens a few ears.  If they really need more, you've still got time to work out how to cut out a network hop with a cache or double the number of disk spindles somewhere.
And of course, once you have this infrastructure, keep it running permanently to catch the points in development where you dip below your targets.
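One way to keep the rig honest is to compare each run against the agreed targets automatically. The sketch below is only an illustration of that check, again in Python: the metric names, the target values and the sample run are all hypothetical, and how you schedule the run (nightly job, build server step, whatever you have) is up to you.

```python
def check_against_targets(measured, targets):
    """Return a list of breach descriptions; an empty list means all targets met.

    `targets` maps a metric name to ("max", limit) for ceilings such as
    latency, or ("min", limit) for floors such as throughput.
    """
    breaches = []
    for metric, (kind, limit) in targets.items():
        value = measured[metric]
        if kind == "max" and value > limit:
            breaches.append(f"{metric}: {value} exceeds limit {limit}")
        elif kind == "min" and value < limit:
            breaches.append(f"{metric}: {value} is below floor {limit}")
    return breaches


# Hypothetical targets agreed with the customer.
TARGETS = {
    "p95_s": ("max", 0.5),               # 95th percentile latency, seconds
    "throughput_per_s": ("min", 200.0),  # requests per second
}

if __name__ == "__main__":
    # Illustrative numbers only, standing in for the latest stress run.
    latest_run = {"p95_s": 0.62, "throughput_per_s": 240.0}
    for problem in check_against_targets(latest_run, TARGETS):
        print("TARGET MISSED:", problem)
```

Wire the result into whatever alerts your team already watches, so a dip shows up on the day the offending change goes in rather than at acceptance testing.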