Over the years I've gained a healthy respect for the value of simulation. Abstract algorithm simulation can help you fine-tune your core concepts and transaction structure. Low-level network simulation can give you a controlled environment in which to evaluate and debug your protocol and implementation. Each has a unique value proposition for development.

For high-level simulators, I'm used to seeing either mathematical simulations or transaction-level simulations, where a transaction exchange is modeled purely by its impact on node state. Network packets aren't modeled, just the updates you hope network exchanges will make to node state. These tend to be one-off implementations, because the algorithm being simulated varies from protocol to protocol.
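To make that distinction concrete, here's a minimal sketch of what I mean by a transaction-level simulation, in Python with made-up names: the exchange is applied directly to node state, and nothing network-shaped exists at all.

```python
# Minimal sketch of a transaction-level simulation step (hypothetical names,
# not any particular simulator). A transaction is applied directly to node
# state; no packets, timing, or bandwidth are modeled.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.state = {}  # whatever per-node state the algorithm maintains

def apply_transaction(sender, receiver, payload):
    """Model only the state change we hope the network exchange produces."""
    receiver.state.update(payload)  # e.g. merge gossip, adopt newer values

a, b = Node("A"), Node("B")
a.state["round"] = 7
apply_transaction(a, b, {"round": a.state["round"]})
assert b.state["round"] == 7
```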

Low-level simulators tend to have a more focused mission, such as acting as a transport replacement. The first one that springs to mind is WiDS. These are a great way to validate protocols which ship low-to-moderate amounts of data, and for which you already have a full implementation.

I have a need somewhere between these two. My protocol is already designed, and I want to model artifacts that won't show up in a mathematical/numerical simulation. The protocol's focus is shipping large amounts of data around, which makes a full transport-level simulator impractical: I wouldn't be able to simulate a cloud of any reasonable size.

So, I find myself making my own simulator, which I'm tentatively calling a packet-level simulator. It stops just short of constructing actual packets, but it tracks the size of the packets exchanged for each transaction, along with per-node limits on processing speed, bandwidth, and accessibility. The simulation's primary goal is to model the impact of the actual protocol and of node speed effects on each node. As long as I can model the interaction of data transmissions and connection attempts, I'm not concerned about validating the transcoding of packets.
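As a sketch of what that means in practice (Python again, with hypothetical names), the simulator only needs a handful of numbers per node and per message; packet contents are never constructed, only sizes and rates.

```python
# Sketch of the per-node and per-message bookkeeping in a packet-level
# simulator. Hypothetical names: sizes and limits are tracked, contents are not.

from dataclasses import dataclass, field

@dataclass
class SimNode:
    node_id: str
    bandwidth_bps: float           # per-node bandwidth cap
    processing_msgs_per_s: float   # per-node processing-speed limit
    reachable: bool = True         # accessibility (e.g. NATed or offline)

@dataclass
class Message:
    sender: str
    receiver: str
    size_bytes: int                # only the size matters to the simulation
    bytes_remaining: float = field(init=False)

    def __post_init__(self):
        self.bytes_remaining = float(self.size_bytes)
```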

My simulator is working fairly well. I'm on my third iteration of optimizations.

The first iteration was a relatively naive approach: I recalculated the speed of data flow across all connections each time a message started or finished transmission anywhere in the simulation. Although accurate, it could only run at about 20x realtime.
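In sketch form, the naive pass looked roughly like the following. I'm assuming a simple fair-share allocation here (each node splits its bandwidth evenly across its active transfers, and each transfer runs at the bottleneck of its two endpoints); the names and the allocation rule are illustrative, not the simulator's exact ones.

```python
# Sketch of the naive global recalculation: on every message start/finish,
# rescan every active transfer in the whole simulation. Hypothetical names,
# building on the SimNode/Message sketch above.

def recalc_all_speeds(active_transfers, nodes):
    """active_transfers: list of Message; nodes: dict of node_id -> SimNode."""
    # Count active transfers per node so bandwidth can be split fairly.
    load = {}
    for m in active_transfers:
        load[m.sender] = load.get(m.sender, 0) + 1
        load[m.receiver] = load.get(m.receiver, 0) + 1

    # Each transfer runs at the slower of its two endpoints' fair shares.
    speeds = {}
    for m in active_transfers:
        send_share = nodes[m.sender].bandwidth_bps / load[m.sender]
        recv_share = nodes[m.receiver].bandwidth_bps / load[m.receiver]
        speeds[id(m)] = min(send_share, recv_share)
    return speeds
```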

The second iteration added a targeted optimization for the simplest case: a message transition (a start or finish) whose impact on transmission speeds is restricted to the node sending that message. In the sorts of networks I'm modeling, this covers about 80%-90% of cases, which yields an effective 5x-10x speedup.
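Under the fair-share model sketched above, that fast path amounts to checking whether the event's node remains the bottleneck for every transfer it touches; if so, only its own shares change and the global rescan can be skipped. Again a hypothetical sketch, not the simulator's actual test:

```python
# Sketch of the sender-local fast path. If the event node stays the bottleneck
# for all of its transfers after the event, no speed outside it can change.
# Hypothetical names, continuing the fair-share sketch above.

def try_local_update(event_node, active_transfers, nodes, speeds, load):
    """Return True if a sender-local update sufficed, False to fall back."""
    local = [m for m in active_transfers
             if m.sender == event_node or m.receiver == event_node]
    new_share = nodes[event_node].bandwidth_bps / max(load[event_node], 1)
    for m in local:
        other = m.receiver if m.sender == event_node else m.sender
        if new_share > nodes[other].bandwidth_bps / load[other]:
            return False           # a remote node becomes the bottleneck
    for m in local:
        speeds[id(m)] = new_share  # safe: bottleneck stays at event_node
    return True
```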

Still, the simulator isn't quite where I want it to be. More than 96% of all CPU time is consumed by the graph speed updates for the cases that miss the optimization above. Simulating 200 nodes for 18 simulated hours took several hours of wall-clock time, and I want to simulate clouds of at LEAST 500 nodes for 24-hour stretches.

I'm working on what I hope will be the final optimization: handling changes to packet transfer speeds whose effects reach up to two hops from the sender/receiver of the message. This is proving to be quite difficult. I'm not aware of anything I can grab from graph theory, because I'm not optimizing flow: each node can be thought of as a source of data, with the goal of sending that data to all of its neighbors as quickly as possible. There's no multi-hop data flow to speak of, though: A sends one set of data to B, and B sends a different set of data to C. The graph is directed, with degree up to 8, and there can be multiple edges going the same direction between any pair of nodes, with different messages flowing over them.
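The bookkeeping half of that idea is easy to sketch: collect the subgraph within two hops of the event's endpoints and rescan only those transfers, which at degree 8 bounds the work to a few dozen nodes rather than the whole cloud. (Hypothetical names; the hard part, proving that no speed outside this region can change, is exactly what I keep tripping over.)

```python
# Sketch of collecting the two-hop region around a transfer event. With degree
# up to 8, this is at most a few dozen nodes instead of the whole simulation.
# adjacency: dict of node_id -> set of neighbor ids (the directed multigraph's
# edges treated as undirected for the purposes of this walk).

from collections import deque

def two_hop_region(adjacency, sender, receiver):
    """Breadth-first walk out to two hops from both endpoints of the event."""
    region = set()
    frontier = deque([(sender, 0), (receiver, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if node in region:
            continue
        region.add(node)
        if depth < 2:
            for neighbor in adjacency.get(node, ()):
                frontier.append((neighbor, depth + 1))
    return region
```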

I've been trying to wrap my head around this third stage for two weeks now. Each time I get close, I find an exception to my logic that makes me question all of the progress I've made. Hopefully I'll get it sorted by the end of this week. In the meantime, it helps to blurt out the scope of the problem. :)

If you ever find yourself working on a similar set of problems, I'll be happy to go into more detail about what I'm doing and how it's useful. I'm considering making the packet-level simulator publicly available once I get it working well enough, and releasing at least one paper based on its results. If this would be useful to you, please let me know.