1.   Passive Data

Data, as we know it today, resides quietly in designated data stores and lets external services hammer on it without losing its quality. Active logic is coded in services, not in data. It is services that decide how to transport, reproduce, and/or reshape data. Being passive, data is susceptible to an unlimited number of attacks of uncontrolled brutality. Here is an example:

 

“Secure” File Systems

Most file systems today support some form of access control for their files and folders. Although access control information is stored as part of the file system, enforcing it is an activity performed by the operating system, which is a service separate from the file system. As a result, on a Windows/Linux dual-boot machine, the permissions set by one operating system mean nothing once the volume is mounted from the other, and every single file and folder becomes accessible. Install this NTFS driver in Linux http://data.linux-ntfs.org and see what the SYSTEM account can do. Alternatively, install this Ext2/Ext3 driver in Windows http://www.fs-driver.org and feel like a Linux root.

 

It has been a long time since the software industry recognized the need to activate data: structs and functions have been combined into classes with public and private members. Great! If that need has been obvious for source code, why is it not obvious for compiled products? Data units must implement their own processing functionality! Here is a good example:

 

Magnetic Stripe vs. Smart Card

If an attacker gets hold of a magnetic-stripe card, all they need is a reading device and a software driver, and they can dump everything from the card (even if that takes a large number of attempts). A smart card cannot simply be read; it can only be asked to respond to well-formed requests. And, if the smart card detects a threat, it may stop responding, or even destroy itself. Today smart cards are standalone hardware devices, but I believe one day all the data we produce and exchange will be protected like that.
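
To make the contrast concrete, here is a minimal sketch of a data unit that behaves like the smart card described above: it cannot be dumped, only asked, and it wipes itself after too many bad requests. The class name `SmartRecord`, the PIN check, and the three-failure limit are my own assumptions for illustration, and Python does not truly enforce private members, so this only shows the shape of the interface.

```python
import hmac
from typing import Optional


class SmartRecord:
    """Hypothetical data unit that answers requests instead of being read."""

    def __init__(self, secret: str, pin: str, max_failures: int = 3):
        self._secret = secret
        self._pin = pin
        self._failures = 0
        self._max_failures = max_failures  # assumed threat threshold
        self._destroyed = False

    def request(self, pin: str) -> Optional[str]:
        """Return the secret for a good request, None otherwise."""
        if self._destroyed:
            return None
        # Constant-time comparison, so timing does not leak the PIN.
        if hmac.compare_digest(pin.encode(), self._pin.encode()):
            self._failures = 0
            return self._secret
        self._failures += 1
        if self._failures >= self._max_failures:
            # Threat detected: wipe the payload and stop responding.
            self._secret = ""
            self._destroyed = True
        return None
```

After three consecutive bad PINs the record stays silent forever, even when the correct PIN is finally presented.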

 

2.   What Does It Mean for Data to Become “Smart”?

  • First, data will be produced and exchanged in Self-Contained Units of Data (SCUD). Each scud will carry sufficient code to perform all its public operations. That code must be executable on all major hardware/OS configurations. That doesn’t mean each Word document will embed WinWord.exe. Instead, a document will call a certified implementation of a standard WinWord API by a trusted vendor. Scuds are sandboxed according to their type, i.e. each type gets a different set of enabled services.

 

  • Second, the content of every scud must be encrypted to protect its confidentiality, the scud as a whole must be hashed to prevent tampering with its content and/or code, and the scud may optionally be signed to guarantee its origin. Specifying a complete mechanism for protecting scuds is not a trivial problem, and is a good topic for another post. I would appreciate any help with that.

 

  • Third, services will become servants of scuds. Instead of performing actions on data, services will create a favorable environment that attracts scuds to perform their own operations.
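
As a rough illustration of the second point, the sketch below hashes a scud's content and code together and attaches a keyed tag. It is not the complete protection mechanism, which remains an open problem. Everything here is an assumption of mine: the names `Scud`, `seal`, and `verify`, the use of SHA-256, and especially the HMAC, which stands in for a real public-key signature because the Python standard library has none. Encryption of the content is left out for the same reason.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Scud:
    content: bytes    # would be ciphertext in a real scud
    code: bytes       # the operations the scud carries
    digest: str       # SHA-256 over content and code
    signature: str    # HMAC stand-in for a real digital signature


def _digest(content: bytes, code: bytes) -> str:
    h = hashlib.sha256()
    # Length-prefix each part so the content/code boundary cannot be shifted.
    for part in (content, code):
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def seal(content: bytes, code: bytes, key: bytes) -> Scud:
    """Hash the whole scud, then tag the hash with the owner's key."""
    digest = _digest(content, code)
    signature = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    return Scud(content, code, digest, signature)


def verify(scud: Scud, key: bytes) -> bool:
    """Check that neither content nor code changed and that the tag matches."""
    digest = _digest(scud.content, scud.code)
    expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(digest, scud.digest)
            and hmac.compare_digest(expected, scud.signature))
```

Changing a single byte of either the content or the code makes `verify` return False, which is the tamper detection the hash is meant to provide.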

 

3.   Consequences

3.1.          Privacy

People will have to get used to having, using, and guarding a truly private key (not the one from their employer). That key lets them prove the origin of their scuds and receive confidential scuds without malicious actors along the path impersonating them and intercepting those scuds. Scuds could be further programmed to destroy themselves in an unfriendly environment.

3.2.          Viruses

Viruses will have their renaissance. Since all scuds will be executable, distinguishing between a real message and a virus will be a real challenge. It will be up to the platforms to properly sandbox scuds.  

 

4.   Data Workflows

Having said all that, let’s see what some popular data workflows will look like:

4.1.          Data Transportation

Services no longer transport data packets from node A to node B. Instead, a service on node A creates a socket to its corresponding service on node B. The socket attracts some scuds to travel from node A to node B. There may be multiple open sockets from node A at any given moment. Each socket attracts a different set of scuds (to a different degree).
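
One way to picture "attraction" is as a score each socket assigns to each scud, with the scud choosing the highest score. The sketch below is only that picture, under assumptions of my own: the `attraction` formula (zero for the wrong destination, otherwise the socket's bandwidth) and the names `Socket`, `Scud`, and `dispatch` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scud:
    destination: str


@dataclass
class Socket:
    remote_node: str
    bandwidth: float = 1.0
    carried: list = field(default_factory=list)

    def attraction(self, scud: Scud) -> float:
        """Hypothetical pull: zero unless the socket leads to the destination."""
        if scud.destination != self.remote_node:
            return 0.0
        return self.bandwidth


def dispatch(scuds, sockets):
    """Let each scud pick the socket that attracts it most.

    Returns the scuds that no open socket attracts; they stay on the node.
    """
    waiting = []
    for scud in scuds:
        best = max(sockets, key=lambda s: s.attraction(scud), default=None)
        if best is None or best.attraction(scud) <= 0:
            waiting.append(scud)
        else:
            best.carried.append(scud)
    return waiting
```

The decision sits with the scud, not the service: the socket only advertises how attractive it is.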

4.2.          Web Browsing

A user creates a socket to a web server and requests a copy of a scud within the server’s scope. The web server asks the scud to replicate itself and attracts the copy through the socket to the browser. The copy itself may not be replicable or persistable. (The server may instead be configured to send the original scud, as in a “drop box” scenario.) If the request is for the result of a servlet, that servlet creates a scud dynamically and the server attracts that dynamic scud through the socket to the browser.
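
The replicate-or-hand-over choice can be sketched in a few lines. This is an illustration under my own assumptions: `PageScud`, `serve`, the `replicable` flag, and the dictionary standing in for the server's scope are all hypothetical names, and the socket itself is left out.

```python
from dataclasses import dataclass


class NotReplicable(Exception):
    """Raised when a scud refuses to copy itself."""


@dataclass
class PageScud:
    body: str
    replicable: bool = True

    def replicate(self) -> "PageScud":
        if not self.replicable:
            raise NotReplicable("this scud may not be copied")
        # The copy sent to the browser is itself marked non-replicable.
        return PageScud(self.body, replicable=False)


def serve(store: dict, path: str, drop_box: bool = False) -> PageScud:
    """Return a copy of the scud, or the original in the drop-box case."""
    if drop_box:
        return store.pop(path)  # the original leaves the server
    return store[path].replicate()
```

Note that the server never copies bytes itself; it asks the scud to replicate, and the scud may refuse.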

4.3.          Email

Alice produces a mail scud that is automatically encrypted and hashed. She further signs it with her private key. Alice instructs the scud to take off towards Bob. Then the scud tries to find a local mail service to use as a vehicle for the first leg. The local mail service on her machine creates a socket to the nearest mail server that attracts all outgoing mail scuds. Once at Alice’s mail server, the scud requests a socket towards Bob. A socket is created to Bob’s mail server that attracts the scud. After arrival at Bob’s mail server, the scud sits and waits to be attracted by a socket. When Bob logs onto his machine, his local mail service automatically creates a socket to the mail server. His user token in the socket attracts all mail scuds addressed to him. Thus the one from Alice makes it through the final leg.

4.4.          Permanent Email Address

In order to maintain a permanent email address where we could be reached regardless of employer and Internet service provider (ISP), today we have two options:

·         A free email server, e.g. Hotmail, Yahoo, Gmail, etc.

·         An email proxy, e.g. SourceForge.

Whichever option we choose, it’s not because of any merits but because it’s the lesser evil. Here is how it should be: Bob has a global identity. When Bob logs onto a computer, his session creates the most attractive environment for scuds targeted at his identity. His immediate ISP has the second most attractive environment, and so forth. User identity should propagate throughout the global network in a way similar to how DNS records propagate today. Thus, when Alice sends Bob an email, the scud follows the strongest attraction that should ultimately lead to Bob’s session. Sending a message to Bob does not require knowing and specifying his mail service provider. The only thing needed is the public view of his global identity.
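
"Following the strongest attraction" amounts to a greedy climb over the network graph. The sketch below assumes that attraction values for Bob's identity have already propagated to every node, highest at his session and lower further away; how that propagation works is not specified here. The function `route`, the node names, and the numeric levels are hypothetical.

```python
def route(graph, attraction, start, max_hops=32):
    """Follow the strongest attraction hop by hop.

    graph:      node -> list of neighbouring nodes
    attraction: node -> pull that node exerts for one identity
    Returns the list of nodes visited, starting with `start`.
    """
    path = [start]
    current = start
    for _ in range(max_hops):
        neighbours = graph.get(current, [])
        best = max(neighbours, key=lambda n: attraction.get(n, 0), default=None)
        # Stop at a local maximum: no neighbour pulls harder than here.
        if best is None or attraction.get(best, 0) <= attraction.get(current, 0):
            break
        current = best
        path.append(current)
    return path


graph = {
    "alice-pc": ["alice-isp"],
    "alice-isp": ["alice-pc", "backbone"],
    "backbone": ["alice-isp", "bob-isp", "carol-isp"],
    "bob-isp": ["backbone", "bob-session"],
    "carol-isp": ["backbone"],
    "bob-session": ["bob-isp"],
}
# Assumed attraction levels for Bob's identity, strongest at his session.
attraction_to_bob = {
    "alice-isp": 1, "backbone": 2, "bob-isp": 3, "bob-session": 4,
}
path_to_bob = route(graph, attraction_to_bob, "alice-pc")
```

A greedy climb like this only reaches Bob if attraction strictly increases along some path to his session; when Bob is logged off, the scud would come to rest at the most attractive node that remains, such as his ISP.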

 

5.   Timeline

The trend is obvious: object-oriented programming is unquestioned, and smart cards are very popular in Europe and slowly ramping up in the US. Today the world is still at a stage where information and influence are insufficient, and the problem for people is how to gather more up-to-date information on time. In order for the above vision to turn into reality, the world must enter another stage: one where people want to keep their own information away not from individual hackers but from commercial telemarketers whose business is to intercept traffic and snoop on private information, and where people and network nodes choke on volumes of junk and start losing important information. We are not there yet but we are headed that way.