To continue our exploration of best practices for shape development we will look at grouping in this post. Grouping is a convenient way of packaging shapes and is the mechanism for creating many complex shapes. Visio offers a number of capabilities for groups, from special interaction behaviors to shape transform changes. It's not surprising to see groups used extensively in master shapes. However, grouping can easily be taken too far, and you can encounter serious performance issues when that happens. Let's look at the issues in detail.
First, a group in Visio is a shape. Prior to Visio 2000 groups were special objects that were not quite the same as shapes. They were containers for shapes but could not have their own text or geometry. Beginning with Visio 2000 groups became full-fledged shapes. Today a group can have geometry and text just like any other shape. A group can be either 1-D or 2-D. It is important to remember that groups are shapes too, because this leads to one of the simplest and best optimizations to make in shape design. Visio's overall performance is highly dependent on the total number of shapes being managed. A group containing two sub-shapes is three shapes to Visio. If you are using the group merely as a container, you are wasting one shape. Convert one of the sub-shapes into a group and put the other sub-shape inside it. Now you have only two shapes providing the same functionality.
Second, there is a penalty for using groups in Visio. Every group defines a new coordinate system for the contents of the group. There is overhead required to compute the transforms for these coordinate systems to display shapes on a page. The deeper the nesting of groups, the more coordinate transformations Visio must perform. Thus if you must use groups, it is better to keep the structure as flat as possible. We recommend that you avoid making shapes with more than two levels of nested groups.
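To make the cost of nesting concrete, here is a minimal Python sketch of how a point inside a nested shape could be mapped to page coordinates. It is a deliberately simplified model, not Visio's implementation: it handles translation only (no rotation or flipping), and the `Shape` class and `to_page` function are hypothetical names invented for this illustration. The field names borrow from the PinX, PinY, LocPinX and LocPinY cells of the Shape Transform section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shape:
    """Hypothetical stand-in for a shape's transform (translation only)."""
    pin_x: float       # pin position in the parent's coordinate system
    pin_y: float
    loc_pin_x: float   # pin position in the shape's own coordinate system
    loc_pin_y: float
    parent: Optional["Shape"] = None   # enclosing group, or None for the page


def to_page(shape, x, y):
    """Map a local point to page coordinates.

    Returns (page_x, page_y, steps), where steps is the number of
    coordinate systems that had to be crossed on the way to the page.
    """
    steps = 0
    current = shape
    while current is not None:
        # Move the point from this shape's coordinates into its parent's.
        x = current.pin_x - current.loc_pin_x + x
        y = current.pin_y - current.loc_pin_y + y
        steps += 1
        current = current.parent
    return x, y, steps


outer = Shape(4.0, 4.0, 2.0, 2.0)
inner = Shape(1.0, 1.0, 0.5, 0.5, parent=outer)
leaf = Shape(0.25, 0.25, 0.25, 0.25, parent=inner)

print(to_page(leaf, 0.0, 0.0))   # one step per level of nesting
```

In this model every extra level of grouping adds one more transform step for every point in every sub-shape, which is the intuition behind keeping group structures flat.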
The most common problem we see in custom shapes (and in some of the ones in the Visio box) is that groups are used where they are convenient, not just where they are required. This results in too many shapes being used in a master. When you are designing your custom master shape, determine the minimum number of shapes required to do the job. An additional shape is necessary when you need another piece of text separate from the shape text you already have or when you need a portion of the shape to have different line or fill formatting than the rest. These are the two conditions: more text or different formatting. You do not need a new shape for more geometry. Visio supports multiple geometry sections per shape. As long as these sections can use the same formatting, this is much more efficient than creating a group.
Let's look at an example to see the difference between grouping and geometry sections. Imagine a master consisting of 16 squares. It's easy to create this shape by drawing one square in Visio, then duplicating it 15 times. If you group all 16 squares together, you now have a group with 16 sub-shapes. (A colored box is drawn around the group just for illustration purposes.) Some basic performance testing shows that Visio takes 74 milliseconds to create 50 instances of the master on a page. Each instance requires 64.4 kilobytes of memory.
Compare that master to a similar one that uses multiple geometry sections to accomplish the same task. After duplicating the square to make 16 squares total, a Combine shape operation was run to make a single shape with 16 geometry sections. Because there is only one shape, no group is required. Performance testing shows that 50 instances take 31 milliseconds to create, and Visio requires 36.6 kilobytes of memory for each instance. Thus the combined shape is created more than twice as fast and uses little more than half the memory of the group shape. The differences become more dramatic as you apply local formatting, as we saw in a previous post. In this example, there is no reason to use a group because a single shape can accommodate all the necessary geometry. Additional shapes are only required if multiple pieces of text or different sets of formatting are needed.
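The comparison can be checked with a few lines of arithmetic. The Python snippet below only restates the measurements quoted above; the variable names are ours.

```python
# Measurements quoted in the text: time to create 50 instances,
# and memory required per instance.
group_ms, combined_ms = 74, 31
group_kb, combined_kb = 64.4, 36.6

speedup = group_ms / combined_ms        # how many times faster the combined shape is
memory_ratio = combined_kb / group_kb   # combined shape's memory as a fraction of the group's

print(f"speedup: {speedup:.2f}x")
print(f"memory:  {memory_ratio:.0%} of the group shape")
```

The combined shape comes out about 2.4 times faster to create and needs about 57% of the memory per instance.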
When you do use groups, make sure to keep things flat. Additional nested levels add more shapes and more overhead. Suppose that our example master was created by drawing two squares, grouping them, duplicating the group, grouping the result, and continuing to duplicate and group by two until a single group remained. The original 16 squares have become a 31-shape master: 16 squares plus 8 + 4 + 2 + 1 = 15 groups.
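The shape counts for the flat and the pairwise-nested versions of the 16-square master can be reproduced with a short Python sketch. The nested-list representation and the function names are assumptions made for this illustration: a list stands for a group, anything else for a plain shape, and `nest_in_pairs` assumes the number of items is a power of two.

```python
def count_shapes(node):
    """Count every shape Visio would manage: each group counts as a shape too."""
    if isinstance(node, list):
        return 1 + sum(count_shapes(child) for child in node)
    return 1


def nesting_depth(node):
    """Number of group levels above the deepest plain shape."""
    if isinstance(node, list):
        return 1 + max(nesting_depth(child) for child in node)
    return 0


def nest_in_pairs(items):
    """Group items two at a time until a single group remains."""
    while len(items) > 1:
        items = [[items[i], items[i + 1]] for i in range(0, len(items), 2)]
    return items[0]


flat_group = ["square"] * 16
nested_group = nest_in_pairs(["square"] * 16)

print(count_shapes(flat_group), nesting_depth(flat_group))      # 17 1
print(count_shapes(nested_group), nesting_depth(nested_group))  # 31 4
```

The flat group costs 17 shapes at one level of nesting, while the pairwise version costs 31 shapes at four levels, for exactly the same 16 squares.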
Over the years we have received many customer drawings where performance, memory and file size are severely impacted because of too much grouping. Often it is too late to tell customers to redesign their masters to economize on shape use. Where customers have made changes, the results can be dramatic. Improvements from 2x to 10x are typical. A common explanation from shape designers is that they use nested groups because the masters are assemblies of standard components. The shape designers want to keep the components contained within their own groups to facilitate maintainability. Before committing to this structure we recommend testing the performance of the shapes to know whether the tradeoff in efficiency is acceptable.
For those that are interested in real-world examples of complex but efficient group shapes, look at the Data Graphics callouts in the new Visio 2007 Professional product. These shapes follow the strict rules for when an additional shape is required, but they have some impressive graphical looks as well.
In the previous post we looked at the difference between using instances of master shapes versus using masterless shapes. Visio masters store shape information centrally, reducing memory and increasing performance. Shapes on the page inherit their properties from the master shape until the user makes a local change to override the inherited values. Today we want to analyze another aspect of inheritance and its impact on performance.
User-defined cells are a great way for shape designers and solution developers to add intelligence to a Visio shape. Unlike Scratch cells they support custom row names, making the Shapesheet more readable and understandable. The section name "User-defined" is a misnomer since these are designer / developer defined cells. They can be used to store information that does not need to be directly exposed to the users of the shape.
Suppose we have a solution that adds a new User-defined cell to store a timestamp recording when the shape was dropped. We might have some code that responds to the ShapeAdded event from Visio by inserting a User-defined section, adding a new named row and then setting the cell value. Alternatively, we could redesign our master shape to already have a User-defined section and the necessary named row in it. Our code only needs to set the cell value.
The first approach relies on working exclusively through the Visio API. The second approach relies on inheritance to provide the User-defined row. What are the performance implications here? We ran a little experiment, programmatically dropping 100 shapes on the page. Each shape had 20 User-defined rows. In the first test, the User-defined rows were generated through code. The test took 688 milliseconds and the resulting memory usage was 41 kilobytes for the document. Compare that data to the inheritance scenario where the User-defined rows already existed and the code only needed to set cell values. This test took 360 milliseconds and the resulting memory usage was 36 kilobytes for the document.
The difference is that the inheritance scenario is almost twice as fast as the non-inheritance scenario. Most of this performance delta is caused by the additional work Visio must do to add new rows to the Shapesheet. Using inheritance lets Visio skip this step. Some of the performance delta is related to the increased memory consumption caused by having local copies of the User-defined rows in each shape. In the inheritance scenario, the User-defined rows are inherited but there are local values in the Value cell for each row. In the non-inheritance scenario, the rows themselves are also local.
There is an important issue to be aware of when working with User-defined rows - or any Shapesheet section with nameable rows. Adding a new row to the section breaks the inheritance of all existing rows in the section. The rows and cell values are forced to become local. This can adversely affect performance, memory consumption and file size because one addition can ruin the party for every other row. In our testing, this has added a 2% - 5% performance penalty to shapes with existing inherited User-defined rows. The penalty is caused by the increased memory requirements for the local row and cell data.
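The inheritance behavior described above can be pictured with a toy model in Python. This is only an illustration of the idea, not how Visio stores Shapesheet data; the `UserSection` class, its methods and the row names are hypothetical.

```python
class UserSection:
    """Toy model of a User-defined section on one shape instance."""

    def __init__(self, master_rows):
        self._master = master_rows   # rows shared with the master, not copied
        self._local = {}             # data stored on this instance only

    def get(self, name):
        # A local value wins; otherwise the value is inherited from the master.
        if name in self._local:
            return self._local[name]
        return self._master[name]

    def set_value(self, name, value):
        """Override the value of an existing row; other rows stay inherited."""
        if name not in self._master and name not in self._local:
            raise KeyError(name)
        self._local[name] = value

    def add_row(self, name, value):
        """Add a new row; every existing row in the section becomes local."""
        for existing, inherited in self._master.items():
            self._local.setdefault(existing, inherited)
        self._local[name] = value

    def local_count(self):
        return len(self._local)


master_rows = {"DroppedAt": 0, "Owner": "", "Revision": 1}

shape = UserSection(master_rows)
shape.set_value("DroppedAt", 1234)
print(shape.local_count())   # 1: only the overridden value is local

shape.add_row("Extra", "x")
print(shape.local_count())   # 4: the new row plus all three existing rows
```

Setting a value on a pre-built row costs one local entry, while adding a single row makes the instance carry its own copy of the whole section.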
To summarize, programmatically adding new rows to the Shapesheet is a fairly expensive operation. It is far more efficient to pre-build the rows you need into your master shapes. Also adding a new row to a shape causes all the rows in that Shapesheet section to become local. This creates an ongoing performance impact because of the greater memory usage for the local data. Next time we'll look at the benefits of shallow nesting for group shapes as we continue to explore performance optimizations in Visio.
In the previous two posts we looked at the capabilities of the SetAtRef function in the Shapesheet. This function along with the helper functions SetAtRefExpr and SetAtRefEval allows a shape designer to keep one or more cells synchronized. One important detail that was omitted was the fact that all three functions were introduced in Visio 2003.
Today we explore another Visio 2003 function that provides advanced shape behavior capabilities: the Bound function. Bound has two basic characteristics. First, the function takes any value and ensures that it lies within a set of boundary limits for the cell. If the value is within the boundary limits, Bound returns the value directly. If the value is not within the boundary limits, Bound returns the closest limit value. The second characteristic is that the Visio drawing window is aware of cells with the Bound formula and can visually enforce the boundary limits.
Here is the syntax for the Bound function:
BOUND (value, type, ignore, value1, value2 [,ignore(n), value1(n), value2(n),...])
value: The current value being constrained.
type: Whether the constraint is inclusive (0), exclusive (1), or disabled (2).
ignore: TRUE to ignore the range; FALSE to constrain the value of the cell to the range.
value1: First value in a range.
value2: Second value in a range.
Let’s look at some examples to better understand this function. A Visio document with these examples is attached to this post for you to use with Visio 2003. It is helpful to see firsthand how Visio enforces Bound as you manipulate the shape.
In the first example we want to keep the Width of the shape between 1 and 3 inches. This is an example of an inclusive type. As long as the incoming cell value is between 1 and 3 inches, the Bound function will return that value. If the incoming cell value is less than 1 inch, the function returns 1 inch. If the incoming cell value is greater than 3 inches, the function returns 3 inches. If you resize this shape in Visio, you will see that Visio acknowledges the Bound and prevents resizing outside the allowable range.
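The inclusive behavior in this example amounts to clamping a value to a range. The Python function below is a minimal model of that rule, not the Shapesheet function itself; `bound_inclusive` is a name made up for this sketch, and it assumes the lower limit is passed before the upper limit.

```python
def bound_inclusive(value, low, high):
    """Return value if it lies within [low, high], else the closest limit."""
    return min(max(value, low), high)


# Width constrained to the range 1 to 3 inches.
for width in (0.5, 2.0, 4.0):
    print(width, "->", bound_inclusive(width, 1.0, 3.0))
```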
In this respect, Bound behaves similarly to SetAtRef. Any change made to the cell through the drawing window does not blast the formula already in the shape. Instead, Visio places the incoming cell value into the first argument of the Bound function. The first argument is in fact an implicit SetAtRefExpr function. We don’t show “SetAtRefExpr” to keep the Bound function syntax from getting too confusing, but that is what Visio is doing behind the scenes. Any shape Width that you attempt to set in the drawing window is pushed into the first argument.
The second example demonstrates the use of multiple ranges in Bound. The incoming cell value must fall within one of these ranges. The value1 and value2 arguments are set to the same value in this example to force the shape Width to a specific set of values. When working with multiple ranges, it may be desirable for some of the ranges to be disabled under specific conditions. The ignore argument prior to each range supports disabling of individual ranges.
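Multiple ranges with per-range ignore flags can be modeled in Python as follows. This is a sketch of the behavior described in this post, with several assumptions of ours: the function name `bound_multi` is hypothetical, the widths 1, 2 and 4 are made-up sample values, the value is returned unchanged when every range is ignored, and a tie between two equally close limits goes to the one listed first. How Visio itself handles those last two cases is not covered here.

```python
def bound_multi(value, ranges):
    """Inclusive bound over several ranges.

    ranges is a list of (ignore, low, high) tuples. Ignored ranges take
    no part in the constraint.
    """
    active = [(low, high) for ignore, low, high in ranges if not ignore]
    if not active:
        return value   # assumption: nothing left to constrain against
    for low, high in active:
        if low <= value <= high:
            return value   # already inside an allowed range
    # Otherwise snap to the closest limit of any active range.
    limits = [limit for low, high in active for limit in (low, high)]
    return min(limits, key=lambda limit: abs(limit - value))


# Setting value1 equal to value2 forces the width to specific sizes.
sizes = [(False, 1.0, 1.0), (False, 2.0, 2.0), (False, 4.0, 4.0)]
print(bound_multi(2.8, sizes))   # snaps to 2.0
print(bound_multi(3.5, sizes))   # snaps to 4.0

# Same ranges with the last one disabled through its ignore flag.
sizes_no_4 = [(False, 1.0, 1.0), (False, 2.0, 2.0), (True, 4.0, 4.0)]
print(bound_multi(3.5, sizes_no_4))   # snaps to 2.0
```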
The third example applies Bound to a control handle (you'll have to see this one in Visio). You can constrain the position of the control handle to stay within the bounding box of the shape.
The final example forces a control handle to remain outside the bounding box of a shape (you'll have to see this one in Visio). This is an example of an exclusive type. The control handle must not be within the range limits. There are some complications that must be addressed, however. It is okay for the horizontal position of the control handle to be within the limits if the vertical position is not within the limits. Likewise it is okay for the vertical position of the control handle to be within the limits if the horizontal position is not within the limits. This extra logic is placed into the ignore argument for the range limits. Note that these functions create a circular reference, but Visio can work through the circularity to produce the correct results.
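The role of the ignore argument in the exclusive case can be sketched in Python for the horizontal position alone; the vertical position is symmetric. This is a model under stated assumptions rather than the formulas from the example document: the function names are hypothetical, the bounding box is assumed to run from 0 to Width and from 0 to Height in the shape's local coordinates, positions exactly on a limit are treated as allowed, and a value exactly midway between the limits goes to the lower one.

```python
def bound_exclusive(value, ignore, low, high):
    """Keep value out of the open interval (low, high) unless ignored."""
    if ignore or not (low < value < high):
        return value
    # Inside the forbidden range: move to the closest limit.
    return low if (value - low) <= (high - value) else high


def constrain_handle_x(x, y, width, height):
    """Horizontal position of a control handle that must stay outside the box.

    The horizontal range is ignored whenever the vertical position is
    already outside the box, because the handle is then outside anyway.
    """
    y_is_outside = y < 0 or y > height
    return bound_exclusive(x, y_is_outside, 0.0, width)


# A shape 2 wide and 1 high.
print(constrain_handle_x(0.5, 0.5, 2.0, 1.0))   # inside the box: pushed to 0.0
print(constrain_handle_x(1.5, 0.5, 2.0, 1.0))   # inside the box: pushed to 2.0
print(constrain_handle_x(0.5, 1.5, 2.0, 1.0))   # above the box: stays 0.5
```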
Like SetAtRef, the Bound function opens a whole new set of opportunities for shape behavior. Shape designers can incorporate this behavior to create Smart Shapes with sensible restrictions on the way they can be manipulated.
One of the things we'd like to accomplish with this blog is to highlight some performance optimizations that shape designers and developers should consider adopting as best practices. Unfortunately designing highly efficient shapes and solutions is a bit of a black art in Visio. Today we will look at the fundamental issue of using masters.
Master shapes and instance shapes are described in this post. A master shape allows Visio to store a single copy of a shape definition and only manage the information that is different for each instance of a shape. For many shapes, only the Shape Transform section of the Shapesheet is specific (or "local" as we call it) to each instance. That section contains the PinX and PinY coordinates for the shape's location on the page. As you modify a shape on the page, more Shapesheet cells take on custom values rather than inherit their values from the master shape. Thus Visio requires more memory to manage this information.
A masterless shape, such as a line or rectangle drawn with the drawing tools, consists essentially of 100% local values. That means Visio must store the complete set of Shapesheet properties for each and every shape on the page. Let's compare Visio's performance when working with instances of masters versus masterless shapes. We'll create a master that consists of a plain rectangle for the master shape. We'll draw a plain rectangle using the drawing tools for a masterless shape. Then to get a better reading on the incremental costs of working with these shapes, we duplicate each 100 times. The numbers below are based on some tests done with Visio 2003.
When comparing memory consumption, the masterless rectangle requires about 11,700 bytes per shape. The master instance rectangle requires about 3,600 bytes per shape. The difference gets larger if you apply formatting to the rectangles. The masterless rectangle with custom line, fill and text colors set requires about 33,000 bytes per shape. The master instance rectangle only uses 6,300 bytes per shape. Note that Undo was enabled for this, which would exaggerate the numbers a bit.
These differences also impact the file size of the Visio document. A blank Visio document uses about 18 kilobytes of space. A document with 100 masterless rectangles uses about 52 kilobytes. The 100 master instance rectangles use about 45 kilobytes. Thus when you subtract out the document overhead, masterless requires 26% more space.
Finally, we can look at the time it takes Visio to create 100 shapes on the page. This was measured with Undo disabled and a little automation script to drop the shapes. 100 masterless rectangles were created in about 95 milliseconds. 100 master instance rectangles were created in about 47 milliseconds. Much of the performance difference relates to the extra memory footprint required for the masterless rectangles.
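For readers who want to see how the three comparisons relate, the Python snippet below works through the arithmetic. It only restates the measurements quoted above; the variable names are ours.

```python
# Memory per shape, in bytes.
plain_ratio = 11_700 / 3_600        # masterless vs. master instance, no formatting
formatted_ratio = 33_000 / 6_300    # same comparison with custom formatting

# File size in kilobytes: subtract the blank document before comparing.
blank_kb, masterless_kb, instance_kb = 18, 52, 45
masterless_net = masterless_kb - blank_kb   # 34 KB for 100 masterless rectangles
instance_net = instance_kb - blank_kb       # 27 KB for 100 master instances
extra_space = masterless_net / instance_net - 1

# Time to create 100 shapes, in milliseconds.
time_ratio = 95 / 47

print(f"memory, plain:     {plain_ratio:.2f}x")
print(f"memory, formatted: {formatted_ratio:.2f}x")
print(f"file size:         {extra_space:.0%} more space")
print(f"creation time:     {time_ratio:.2f}x")
```

The masterless rectangles use about 3.3 times the memory (about 5.2 times once formatting is applied), 26% more file space, and about twice the creation time.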
Hopefully this gets you thinking about the efficiency of your own content and code. Do you generate large diagrams by drawing lots of lines and rectangles? You should think about masters. Do you work with masters but heavily customize the shape instances once they are on the page? Perhaps you can consolidate the customizations into a few masters that can be instanced. Or perhaps you can place many of the customizations into a Visio style and apply that to your shape instances. Do a little measurement with Windows Task Manager yourself to understand what the per shape memory costs are.
We'll take a look at additional performance optimizations in some future posts.