Imagine Think Create Share

Office | SharePoint | Search | Architecture | Stuff
Blog - Title

March, 2009

  • Imagine Think Create Share

    Sharepoint: supportability scenarios

    • 0 Comments

    Today I posted some samples about Search and Powershell integration. I mention about the supportability of querying sharepoint tables. I thought about it and I wanted to clarify my point of view and what is public & published.

    I have listed the main areas (I'm primarly focusing on Office12) where I have seen supportability questions. If you have any question, please call your support contact or access http://support.microsoft.com/

    On the other hand, there are articles as SharePoint Database Access where it is recommended to not query database but instead use the object model.

    Looking back to the scenario of my original comment if you are debugging with the SQL Profiler your SSP Database and at the same time running the powershell script or getting the crawl history, you will see something like:

    • SP:Starting Event
      • exec dbo.proc_MSS_GetCrawlHistory @ContentSourceID=NULL,@MaxRecords=NULL,@BeginTime=NULL,@EndTime=NULL,@CrawlStatus=NULL
    • SP:StmtCompleted
      • SELECT A.CrawlID,A.CrawlType,B.ContentSourceID,A.Status,A.StartTime,A.EndTime,CAST(A.SuccessCount as int) as SuccessCount,A.ErrorCount,A.WarningCount from MSSCrawlHistory as A inner join (select distinct CrawlID,ContentSourceID from MSSCrawlContent) as B on A.CrawlID = B.CrawlID WHERE A.ProjectID = 1 AND DATEDIFF([Day], A.StartTime, GETDATE()) <= 7 ORDER BY A.EndTime DESC

    So you may think: "as these tables are not heavy used, a directly querying with some changes may have less locking, we should follow that", something that you may test only in you lab/dev environment.

    SELECT A.CrawlID,A.CrawlType,B.ContentSourceID,A.Status,A.StartTime,A.EndTime,CAST(A.SuccessCount as int) as SuccessCount,A.ErrorCount,A.WarningCount from MSSCrawlHistory as A with(nolock) inner join (select distinct CrawlID,ContentSourceID from MSSCrawlContent with(nolock)) as B on A.CrawlID = B.CrawlID WHERE A.ProjectID = 1 AND DATEDIFF([Day], A.StartTime, GETDATE()) <= 7 ORDER BY A.EndTime DESC

    Just thinking in the task, this may seems better, but think about the possibility that committed data may be a requirement.

    Finally, should you TSQL your sharepoint database? I would say no, as:

    • You may affect overall response of your environment
    • Unless you have enough information through our open specification program you may conflict with you EULA acceptance.

    What would be better, querying or using the object model?

    • Working with the object model gives you supportability for your code as you won't have breaking changes (backward compatibility) or you will be advised about deferred/obsolete code and have migration paths. something that using TSQL you won't have.

    Namaste!

  • Imagine Think Create Share

    How to: Programmatically Manage the Crawl of a Content Source in PowerShell

    • 1 Comments

    Hi,

    If you have read my previous post you may think, why did you stop there? Well, that is what I thought too :), and started with this article from MSDN.

    We wanted to start managing the crawling of our content source more programmatically, as we have seen that running several at the same time affects the overall process.

    So running these powershell scripts as scheduled task and monitoring the status can improve in crawling.

    You may improve them reusing the context and content sources and doing some pipeline 

    to be included in the overall file:

    ## SharePoint Reference
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Administration")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server.Search.Administration")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server.Search")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server")

    To start an incremental crawl of the content source

    function global:StartIncremental-Crawl($url, $csname)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $sc = new-Object Microsoft.Office.Server.Search.Administration.Content($c);

     $cs = $sc.ContentSources[$csname];

     $cs.StartIncrementalCrawl();

     $s.Dispose();
    }

    StartIncremental-Crawl -url http://your_site_url -csname "your content source name"

    To start a full crawl of the content source

    function global:StartFull-Crawl($url, $csname)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $sc = new-Object Microsoft.Office.Server.Search.Administration.Content($c);

     $cs = $sc.ContentSources[$csname];

     $cs.StartFullCrawl();

     $s.Dispose();
    }

    StartFull-Crawl -url http://your_site_url -csname "your content source name"

    To pause a crawl in process

    function global:Pause-Crawl($url, $csname)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $sc = new-Object Microsoft.Office.Server.Search.Administration.Content($c);

     $cs = $sc.ContentSources[$csname];

     $cs.PauseCrawl();

     $s.Dispose();
    }

    Pause-Crawl -url http://your_site_url -csname "your content source name"

    To resume a paused crawl

    function global:Resume-Crawl($url, $csname)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $sc = new-Object Microsoft.Office.Server.Search.Administration.Content($c);

     $cs = $sc.ContentSources[$csname];

     $cs.ResumeCrawl();

     $s.Dispose();
    }

    Resume-Crawl -url http://your_site_url -csname "your content source name"

    To stop a crawl of the content source

    function global:Stop-Crawl($url, $csname)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $sc = new-Object Microsoft.Office.Server.Search.Administration.Content($c);

     $cs = $sc.ContentSources[$csname];

     $cs.StopCrawl();

     $s.Dispose();
    }

    Stop-Crawl -url http://your_site_url -csname "your content source name"

    To check the crawl status values for a content source

    function global:Get-CrawlStatus($url)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $sc = new-Object Microsoft.Office.Server.Search.Administration.Content($c);

     Write-Output $sc.ContentSources;

     $s.Dispose();
    }

    Get-CrawlStatus -url http://your_site_url | Format-Table -property CrawlStatus, CrawlStarted, CrawlCompleted

    Enjoy!

     

  • Imagine Think Create Share

    How to: Programmatically Export the Crawl History to a CSV File in PowerShell

    • 6 Comments

    Hi,

    When I came across the article at MSDN How to: Programmatically Export the Crawl History to a CSV File I thought I would never create such a tool just for that specific feature, as you end up with additional requirements in order to create an admin tool.

    But today I needed to get data from crawl history, and I didn't want to get them from SQL (remember it is not supported ;)), so I started to write down a simple powershell script to do it. And then I realized that for this atomic actions, indeed it is a great options!: you give admin people multiple commands that they can use/combine to monitor/get information about the environment (and yes many, many things more)

    ## SharePoint Reference
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Administration")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server.Search.Administration")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server.Search")
    [void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server")

    function global:Get-CrawlHistory($url)
    {
     trap [Exception] {
      write-error $("ERROR: " + $_.Exception.GetType().FullName);
      write-error $("ERROR: " + $_.Exception.Message);
     
      continue;   
     }

     $s = new-Object Microsoft.SharePoint.SPSite($url);
     $c = [Microsoft.Office.Server.Search.Administration.SearchContext]::GetContext($s);
     $h = new-Object Microsoft.Office.Server.Search.Administration.CrawlHistory($c);

     Write-OutPut $h.GetCrawlHistory();

     $s.Dispose();
    }

    Then you can just execute: Get-CrawlHistory -url http://your_site_url/ | Export-Csv your_path_and_file_name

    Then you can import to excel and make some charts.

    In order to filter the information some useful columns should be denormalized: CrawlType, ContentSourceID, Status.

    Cheers!

Page 1 of 1 (3 items)