<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Zach Skyles Owens : Genetic Algorithms</title><link>http://blogs.msdn.com/zowens/archive/tags/Genetic+Algorithms/default.aspx</link><description>Tags: Genetic Algorithms</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Ideas for Applying Data Mining in Academia</title><link>http://blogs.msdn.com/zowens/archive/2007/09/28/ideas-for-applying-data-mining-in-academia.aspx</link><pubDate>Fri, 28 Sep 2007 21:41:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5191313</guid><dc:creator>ZachSkylesOwens</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/zowens/comments/5191313.aspx</comments><wfw:commentRss>http://blogs.msdn.com/zowens/commentrss.aspx?PostID=5191313</wfw:commentRss><description>&lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;I recently interviewed &lt;/font&gt;&lt;a class="" title="Denny Lee" href="http://denster.spaces.live.com/default.aspx" target="_blank" mce_href="http://denster.spaces.live.com/default.aspx"&gt;&lt;font face="Calibri" size="3"&gt;Denny Lee&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt; from the &lt;/font&gt;&lt;a class="" title="SQL Customer Advisory Team" href="http://sqlcat.com/" target="_blank" mce_href="http://sqlcat.com/"&gt;&lt;font face="Calibri" color="#0000ff" size="3"&gt;SQL Customer Advisory Team&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt; about applying &lt;/font&gt;&lt;a class="" title="data mining" href="http://en.wikipedia.org/wiki/Data_mining" target="_blank" mce_href="http://en.wikipedia.org/wiki/Data_mining"&gt;&lt;font face="Calibri" 
size="3"&gt;data mining&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt; to medical research; you can watch the &lt;/font&gt;&lt;a class="" title="Channel 8 Video" href="http://channel8.msdn.com/Posts/Denny-Lee-from-SQL-CAT-Data-Mining-applied-to-medical-research/" target="_blank" mce_href="http://channel8.msdn.com/Posts/Denny-Lee-from-SQL-CAT-Data-Mining-applied-to-medical-research/"&gt;&lt;font face="Calibri" size="3"&gt;video on Channel 8&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt;.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;I asked Denny about other areas of academic research where data mining could be applied.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;I gave this question some thought on my own, and one interesting area could be &lt;/font&gt;&lt;a class="" title="genetic algorithm" href="http://en.wikipedia.org/wiki/Genetic_algorithm" target="_blank" mce_href="http://en.wikipedia.org/wiki/Genetic_algorithm"&gt;&lt;font face="Calibri" size="3"&gt;genetic algorithm (GA)&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt; research.&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;My advisor for my Computer Science degree was &lt;/font&gt;&lt;a class="" title="Dr. Jeff Horn" href="http://cs.nmu.edu/~jeffhorn/" target="_blank" mce_href="http://cs.nmu.edu/~jeffhorn/"&gt;&lt;font face="Calibri" size="3"&gt;Dr. 
Jeff Horn&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt;, whose main area of research was GAs.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;I am by no means an expert on the topic, but I had the opportunity to work a bit with the technology.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;GAs are generally used to find &amp;#x201C;best fit&amp;#x201D; solutions to &lt;/font&gt;&lt;a class="" title="NP-Complete" href="http://en.wikipedia.org/wiki/NP-complete" target="_blank" mce_href="http://en.wikipedia.org/wiki/NP-complete"&gt;&lt;font face="Calibri" size="3"&gt;NP-complete&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt; problems, where it is &amp;#x201C;impossible&amp;#x201D; to test every possible combination, so a method for finding a best-fit solution is needed.&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;&lt;a href="http://blogs.msdn.com/blogfiles/zowens/WindowsLiveWriter/IdeasforApplyingDataMininginAcademia_B3B6/dna_2.jpg"&gt;&lt;img id="id" style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="244" alt="dna" src="http://blogs.msdn.com/blogfiles/zowens/WindowsLiveWriter/IdeasforApplyingDataMininginAcademia_B3B6/dna_thumb.jpg" width="100" align="left" border="0" /&gt;&lt;/a&gt; GAs use a paradigm based on evolutionary biology in which solutions combine, cross over, mutate, etc.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;The goal is to find a solution with the best &amp;#x201C;fitness&amp;#x201D;.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;The GAs we were using 10 years ago took a number of startup parameters that controlled the number of generations, the chance of mutation and crossover, etc.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;Depending on the problem set, we would see the GA runs come up with 
solutions of varying fitness. &lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;Now you may be asking yourself &amp;#x201C;What about data mining?&amp;#x201D;&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;We would run these GAs over and over, logging and analyzing the results.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;My suggestion would be to store the startup parameters and the results of the runs, or the performance of each generation, in a data warehouse.&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;Examples of the types of questions you could use data mining classification or clustering to answer are:&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoListParagraphCxSpFirst" style="margin: 0in 0in 0pt 0.5in; text-indent: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;span style="font-family: symbol; mso-fareast-font-family: symbol; mso-bidi-font-family: symbol"&gt;&lt;span style="mso-list: ignore"&gt;&lt;font size="3"&gt;&amp;#xB7;&lt;/font&gt;&lt;span style="font: 7pt &amp;#x27;Times New Roman&amp;#x27;"&gt;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0; &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;font face="Calibri" size="3"&gt;What are the clusters of ranges of startup parameters that provide the highest fitness?&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoListParagraphCxSpLast" style="margin: 0in 0in 10pt 0.5in; text-indent: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;span style="font-family: symbol; mso-fareast-font-family: symbol; mso-bidi-font-family: symbol"&gt;&lt;span style="mso-list: ignore"&gt;&lt;font size="3"&gt;&amp;#xB7;&lt;/font&gt;&lt;span style="font: 7pt &amp;#x27;Times New Roman&amp;#x27;"&gt;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0;&amp;#xA0; &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;font face="Calibri" size="3"&gt;Is the 
algorithm generating clusters of solutions?&lt;span style="mso-spacerun: yes"&gt;&amp;#xA0; &lt;/span&gt;What are they, and what is causing them?&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;Although the questions above only scratch the surface, I think researchers who live and breathe this stuff could really benefit from the power of data mining tools to complement their research, tune the algorithms, etc.&lt;/font&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin: 0in 0in 10pt"&gt;&lt;font face="Calibri" size="3"&gt;Do any readers out there have more ideas for how data mining could be applied in academia?&lt;/font&gt;&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=5191313" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/zowens/archive/tags/SQL+Server/default.aspx">SQL Server</category><category domain="http://blogs.msdn.com/zowens/archive/tags/Genetic+Algorithms/default.aspx">Genetic Algorithms</category><category domain="http://blogs.msdn.com/zowens/archive/tags/Data+Mining/default.aspx">Data Mining</category><category domain="http://blogs.msdn.com/zowens/archive/tags/Research/default.aspx">Research</category></item></channel></rss>