Welcome to MSDN Blogs Sign in | Join | Help

String Optimization. How does it work?

In this post: Quiz: String performance optimization, I showed some very similar code with drastically different performance.

 

When VFP sees an assignment statement like this:

            var =  <something>

here’s what happens. The variable on the left is analyzed, the expression on the right is evaluated, and the result is assigned to the variable.

Expression evaluation is done with a recursive RPN evaluator (Reverse Polish Notation, like HP calculators) . For example, suppose the assignment were

            var =  x + “test”

The evaluator will push the value of x on the internal stack, then push the string “test”, then call the Add routine. The Add routine pops off the top two stack items, adds them, then places the result on the stack.

Now I can see light bulbs lit in the reader’s mind! The evaluator pushes values on the stack and calls routines to process those values, expecting the result at the top of the stack.

One can now imagine how more complex expressions are evaluated using RPN.

 

Now consider the original sample and imagine that x is a very long string.

 

x =  "OneTwoThree” + x

 

The 1st string value is pushed, then the value of x is pushed on the stack. Pushing a string on the stack means memory is allocated for the string, then that string is copied into that allocation. The Add routine is called. It pops off the 2nd value, calculates the total length required for the sum, resizes the top of stack (which is the 1st value) and appends the 2nd (long) string to the new allocation and returns, with the result at the top of the stack. Then the Assign routine takes the top of stack and copies it to the variable x (which means another allocate and copy).

How many memory allocations and copies was that for the long string?

 

This algorithm worked fine for over a decade. For certain scenarios, customers presented scenarios where string operations were quite slow.

These were typically building up strings using concatenation:

            x=x+”something”

            x=x+”something2”

            x=x+”something3”

Eventually, x was so large that performance was noticeably poor.

One can see from the above analysis that there were many string allocations and copies. In the concatenation case, it’s easy to see that many of the allocations/copies can be avoided.

 

Sooo, 8 years ago for VFP6 I added some code (a couple hundred lines!) to the Assign routine to recognize the string append case and do the right thing. This resulted in hundreds of times faster performance in many scenarios.

 

 

As Stuart Ballard pointed out, an additional difference in the performance could have been because the string was being prepended in one case, and appended in the other.

 

Of course, an optimizing compiler can analyze the code and just create a string of REPLICATE(“OneTwoThree”,30000), which is instantaneous.

 

 

See also Steve  Black’s Text and String Handling in VFP and (thanks to Frank Camp) Markus Egger and Rod Paddock wrote an article about the string performances in VFP and C#: http://www.eps-cs.com/VFPConversion/Article.aspx?quickid=030054

 

Published Thursday, September 29, 2005 11:22 AM by Calvin_Hsia
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

# re: String Optimization. How does it work?

Thursday, September 29, 2005 2:37 PM by Fabio Lunardon
Because not to optimize STUFF(var,n,lr,sr) when lr=LEN(sr).
it would be extremely useful.

# re: String Optimization. How does it work?

Thursday, September 29, 2005 2:38 PM by Fabio Lunardon
Why not to optimize STUFF(var,n,lr,sr) when lr=LEN(sr).
it would be extremely useful.

# re: String Optimization. How does it work?

Sunday, October 02, 2005 7:38 PM by Garrett Fitzgerald
Unfortunately, this optimization can fall down badly in recursive algorithms. Run the code from http://www.capescience.com/webservices/globalweather/GWFoxpro.html as it stands. Then change the line

lcRetVal = "" + lcRetVal + ;

to

lcRetVal = lcRetVal + ;

and note how the returned string loses a lot of its content.

# String Optimization: More detail

Thursday, September 27, 2007 8:41 PM by Calvin Hsia's WebLog

In the prior post ( String Optimization. How does it work? ) I described an optimization of how the Assignment

Leave a Comment

(required) 
required 
(required) 
 
Page view tracker