This isn't an everyday kind of blog article. It's aimed at data architects, and at anyone who needs to think about securing sensitive data, such as student performance data or financial information, across a large education organisation.

There are huge amounts of data being collected in education. A lot of it isn't sensitive, but some of it should only be accessible to certain people in the organisation: for example, medical information on students, or the addresses of students in specific categories. Education has traditionally protected this information by limiting its availability (often by only having it on paper!), but the growth of large collections of data, which can become more sensitive as the database grows, means that you need to think carefully about how the data is protected and who can access it.

Our own IT team at Microsoft have exactly the same problem, and use standard classifications to group and protect data that contains financial and personally identifiable information (PII). They've implemented a system that automates much of the work of classification and protection (for example, by automating classification they have reduced the misclassification rate from 30% to 3%). The benefits they describe are:

  • Mitigating risks
  • Reducing total cost of ownership
  • Streamlined, automated process
  • More granular view of data
  • Faster, more accurate tagging
  • Improved security through persistent protection
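
To make the idea of automated classification a little more concrete, here is a minimal sketch in Python. It is not how the Microsoft IT system works (the case study has those details). The level names ("High", "Medium", "Low"), the keyword patterns and the function names are all my own assumptions, chosen purely for illustration: each field name in a dataset is matched against a list of rules, and the first rule that matches decides its sensitivity level.

```python
import re

# Hypothetical sensitivity levels and keyword patterns (illustrative only).
# Rules are checked in order, so the most sensitive level comes first.
RULES = [
    ("High", re.compile(r"medical|health|bank|account|tax", re.IGNORECASE)),
    ("Medium", re.compile(r"address|birth|phone|email|grade|result", re.IGNORECASE)),
]


def classify_field(field_name):
    """Return the level of the first rule that matches, or "Low" if none do."""
    for level, pattern in RULES:
        if pattern.search(field_name):
            return level
    return "Low"


def classify_schema(field_names):
    """Map every field name in a dataset to its sensitivity level."""
    return {name: classify_field(name) for name in field_names}


if __name__ == "__main__":
    fields = ["student_id", "home_address", "medical_notes", "favourite_subject"]
    for name, level in classify_schema(fields).items():
        print(f"{name}: {level}")
```

A real system would need far more than keyword matching on field names (inspecting the data itself, for instance, and applying protection that travels with it), but even a simple rule table like this shows why automation can be more consistent than asking people to tag each dataset by hand.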

I would bet that almost every significant education institution in Australia has the same need. You can read the full Microsoft IT case study, which covers how the system was implemented within Microsoft's internal systems, to understand what the team did and the challenges they faced in doing it.