Chris’s SharePoint Reflections

Just another weblog

  • Chris Zhong

    IT consultant Australia

Archive for the ‘Search’ Category

Leverage Search Crawl rules and the Content Class property to refine SharePoint Search results

Posted by chrissyz on April 18, 2010

Out-of-the-box (OOTB) SharePoint search is quite powerful. The search engine of MOSS 2007 has only one physical index for each SSP in the farm, which means the content from all the content sources defined for the SSP is crawled into the same index. This gives users the ability to search across all the content with one query. But many search scenarios also call for mechanisms that automatically narrow user queries to a logical group of content within the physical index. I will introduce two methods to achieve that here.

Use Search Crawl rules

Search crawl rules are a mechanism for influencing the behaviour of the crawler when it crawls specific sites. A single crawl rule is created by specifying a URL wildcard that matches the sites, plus a set of options that set the behaviour of the crawler for those sites.

For example, if you would like all the document views and properties pages to be excluded, you can achieve it by configuring crawl rules:

1. Go to the crawl rules section of the search settings in the SSP

2. Add crawl rules to exclude the following path:




You can also test a specific URL against the crawl rules to determine whether the rules will include or exclude the URL during a crawl. This feature is not available in SharePoint Server 2003.

In SharePoint 2007, the wildcard operator “*” is the only operator supported in crawl rules. Because it matches everything, it does not have the flexibility to, for example, recognize and omit URLs that contain a mobile phone number.

SharePoint 2010 adds a new capability in this area: support for regular expressions in the crawl rule URL.

Check the Microsoft Enterprise Search blog for more details.
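
To make the difference between the two matching styles concrete, here is a minimal Python sketch. It is only a model, not how the SharePoint crawler is implemented: the intranet.example host, the two exclusion patterns and the 10-digit phone number format are all assumptions made up for illustration, and Python's fnmatch stands in for the crawl rule wildcard.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns (illustration only, not the rules from this post).
WILDCARD_EXCLUDES = [
    "http://intranet.example/*/forms/allitems.aspx*",
    "http://intranet.example/*/forms/dispform.aspx*",
]

# Assumed phone number format: a run of exactly 10 digits.
MOBILE_NUMBER = re.compile(r"(?<!\d)\d{10}(?!\d)")


def excluded_by_wildcard(url, patterns=WILDCARD_EXCLUDES):
    """Return True if the URL matches any '*' wildcard exclusion pattern."""
    lowered = url.lower()  # compare case-insensitively
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


def excluded_by_regex(url):
    """Return True if the URL contains something shaped like a phone number."""
    return MOBILE_NUMBER.search(url) is not None


if __name__ == "__main__":
    view_page = "http://intranet.example/sites/hr/Docs/Forms/AllItems.aspx?View=1"
    phone_page = "http://intranet.example/people/0412345678.aspx"
    print(excluded_by_wildcard(view_page))   # the view page is caught by a wildcard
    print(excluded_by_wildcard(phone_page))  # "*" cannot express "any 10 digits"
    print(excluded_by_regex(phone_page))     # a regular expression can
```

The wildcard can only say "anything goes here", so the phone number case needs the regular expression.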

Use Content Class in scope

Most of you already know how to use SharePoint custom scopes to fine-tune your search results. For those who don’t, there is plenty of material on the internet. :) I only want to draw your attention to one of the managed properties, contentclass. Essentially, every piece of SharePoint content seems to be tagged with this property. As long as you know the internal names and their corresponding mappings, you should be able to configure your SharePoint search scopes quite efficiently. For example, if you want to return all the documents and pages, you can set up the scope like this:

Below is a list of content classes and their mappings, prepared by Dan Attis on his blog. It should give you enough information to get started.

        case "STS_Web":                             // Site
        case "STS_List_850":                        // Page Library
        case "STS_ListItem_850":                    // Page
        case "STS_List_DocumentLibrary":            // Document Library
        case "STS_ListItem_DocumentLibrary":        // Document Library Items
        case "STS_List":                            // Custom List
        case "STS_ListItem":                        // Custom List Item
        case "STS_List_Links":                      // Links List
        case "STS_ListItem_Links":                  // Links List Item
        case "STS_List_Tasks":                      // Tasks List
        case "STS_ListItem_Tasks":                  // Tasks List Item
        case "STS_List_Events":                     // Events List
        case "STS_ListItem_Events":                 // Events List Item
        case "STS_List_Announcements":              // Announcements List
        case "STS_List_Contacts":                   // Contacts List
        case "STS_ListItem_Contacts":               // Contacts List Item
        case "STS_List_DiscussionBoard":            // Discussion List
        case "STS_ListItem_DiscussionBoard":        // Discussion List Item
        case "STS_List_IssueTracking":              // Issue Tracking List
        case "STS_ListItem_IssueTracking":          // Issue Tracking List Item
        case "STS_List_GanttTasks":                 // Project Tasks List
        case "STS_ListItem_GanttTasks":             // Project Tasks List Item
        case "STS_List_Survey":                     // Survey List
        case "STS_ListItem_Survey":                 // Survey List Item
        case "STS_List_PictureLibrary":             // Picture Library
        case "STS_ListItem_PictureLibrary":         // Picture Library Item
        case "STS_List_WebPageLibrary":             // Web Page Library
        case "STS_ListItem_WebPageLibrary":         // Web Page Library Item
        case "STS_List_XMLForm":                    // Form Library
        case "STS_ListItem_XMLForm":                // Form Library Item
        case "urn:content-class:SPSSearchQuery":    // Search Query
        case "urn:content-class:SPSListing:News":   // News Listing
        case "urn:content-class:SPSPeople":         // People
        case "urn:content-classes:SPSCategory":     // Category
        case "urn:content-classes:SPSListing":      // Listing
        case "urn:content-classes:SPSPersonListing":// Person Listing
        case "urn:content-classes:SPSTextListing":  // Text Listing
        case "urn:content-classes:SPSSiteListing":  // Site Listing
        case "urn:content-classes:SPSSiteRegistry": // Site Registry Listing
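
As a rough illustration of how such a mapping can drive a scope, here is a small Python sketch. It is not SharePoint code: the dictionary holds a handful of entries copied from the list above, while the "documents and pages" scope definition and the shape of the result records are assumptions of mine, chosen only to show the idea of filtering on contentclass.

```python
# A few contentclass values and their meanings, copied from the list above.
CONTENT_CLASSES = {
    "STS_Web": "Site",
    "STS_List_850": "Page Library",
    "STS_ListItem_850": "Page",
    "STS_List_DocumentLibrary": "Document Library",
    "STS_ListItem_DocumentLibrary": "Document Library Items",
    "STS_ListItem_Tasks": "Tasks List Item",
}

# Hypothetical scope: keep only document library items and pages.
DOCUMENTS_AND_PAGES = {"STS_ListItem_DocumentLibrary", "STS_ListItem_850"}


def describe(content_class):
    """Translate an internal contentclass value into a friendly name."""
    return CONTENT_CLASSES.get(content_class, "Unknown")


def in_scope(result, scope=DOCUMENTS_AND_PAGES):
    """Return True if a result record's contentclass belongs to the scope."""
    return result.get("contentclass") in scope


if __name__ == "__main__":
    # Made-up result records for the demonstration.
    results = [
        {"title": "Budget.docx", "contentclass": "STS_ListItem_DocumentLibrary"},
        {"title": "HR site", "contentclass": "STS_Web"},
        {"title": "Welcome page", "contentclass": "STS_ListItem_850"},
    ]
    for result in results:
        if in_scope(result):
            print(result["title"], "->", describe(result["contentclass"]))
```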



Managed Property in SharePoint Search

Posted by chrissyz on February 17, 2009

Managed properties play a very important role in MOSS 2007 Search. Used properly, they can greatly enrich the user search experience without a lot of effort. Before we get into the topic, let’s look at what a managed property is.

The SharePoint search engine indexes both unstructured information, like the text in Word documents, web pages etc., and structured information (metadata), such as file size, file extension, title, author etc. We call each piece of metadata a property. SharePoint operates with two kinds of properties: crawled properties and managed properties. Crawled properties are discovered by the crawler. Each SSP has a list of crawled properties in the metadata store in its SQL database. Properties are automatically added to the metadata store as the crawler crawls content, so the number of crawled properties can add up quickly. This crawled property list is shared across all content sources in the SSP.

Before MOSS 2007, SharePoint only operated with crawled properties. In reality, however, only a small subset of the crawled properties is needed for searching and for displaying the search results. You also have no control over the names of the crawled properties: the crawler names them itself according to internal naming rules. As a result, they can have lengthy names that make them hard to reference in code or elsewhere.

Managed properties are a new concept introduced in MOSS 2007. They are created by SSP administrators and provide an easier, more consistent way to manage and use the subset of crawled properties that matters to the business. You can also map several crawled properties to the same managed property, which makes it possible to join crawled properties that are semantically the same into a single managed property.
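
The idea of joining several crawled properties into one managed property can be sketched in a few lines of Python. This is a conceptual model only, not the SharePoint API: the crawled property names below are illustrative stand-ins, and the "first non-empty value wins" rule is an assumption I picked to keep the example simple.

```python
# Hypothetical mapping: one managed property fed by several crawled properties.
# The crawled property names are illustrative, not taken from a real index.
MANAGED_TO_CRAWLED = {
    "Author": ["ows_Author", "Office:4", "Mail:6"],
}


def resolve_managed(managed_name, crawled_values):
    """Return the first non-empty crawled value mapped to the managed property."""
    for crawled_name in MANAGED_TO_CRAWLED.get(managed_name, []):
        value = crawled_values.get(crawled_name)
        if value:
            return value
    return None


if __name__ == "__main__":
    word_document = {"Office:4": "Chris"}      # author stored by Office
    list_item = {"ows_Author": "Chris Zhong"}  # author stored by a list column
    print(resolve_managed("Author", word_document))
    print(resolve_managed("Author", list_item))
```

Whatever the source, both items end up searchable under the single name Author.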

Here are some examples of managed properties that are available in MOSS 2007:

  • AssignedTo
  • Author
  • ContentType
  • FileExtension
  • SiteTitle
  • etc.

Check the blog to see how you can create your own managed property. 

The use cases for managed properties:

  • Construct Search Query: The most straightforward way is to execute searches from the search box using managed properties. We can enter them directly into the search box. For example, to find all the documents written by me, I would execute a search in my MOSS site using the following text: “Author:Chris”
  • Customize Search Results: We are all familiar with the OOTB search results that display metadata like title, author, URL etc. Actually, we can customize the search results to display any available metadata, as long as the metadata is a managed property in the index.
  • Expose for advanced search: The Advanced Search page has a property picker that can be populated with managed properties. You can add managed properties to the property picker by modifying the properties XML attached to the Advanced Search Box. Here is an MSDN Visual How-to that shows you step by step how to do that. Check it out!
  • Use in Search scope: You can use managed properties to configure search scopes. Here is a good blog on how to do it.
  • Custom relevancy ranking: Managed properties play a role in the ranking of the results. The ManagedProperty class exposes the Weight property, which can be changed programmatically to influence the relevance ranking.
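
For the first use case, the query text is easy to assemble in code. The Python helper below is a hypothetical sketch of mine, not part of any SharePoint API: it only builds strings in the “Property:value” style shown above, and it assumes that values containing spaces should be wrapped in double quotes.

```python
def property_filter(name, value):
    """Format one managed property filter, e.g. Author:Chris."""
    text = str(value)
    # Assumption: values containing whitespace are wrapped in double quotes.
    if any(ch.isspace() for ch in text):
        text = '"' + text + '"'
    return f"{name}:{text}"


def build_query(keywords="", **filters):
    """Combine free-text keywords with managed property filters."""
    parts = [keywords.strip()] if keywords.strip() else []
    parts += [property_filter(name, value) for name, value in filters.items()]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(Author="Chris"))
    print(build_query("budget", Author="Chris Zhong", FileExtension="docx"))
```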


Search Query TotalRowsExactMinimum property

Posted by chrissyz on February 13, 2009

When we code against the MOSS Search API, we use the TotalRows property to get the total number of search results from the search engine. But sometimes the TotalRows number returned is not accurate. Why does that happen? And how can we fix it?

The reason is that TotalRows only gives you an estimate. It is very expensive to get an exact number because of the security trimming and search algorithm logic in SharePoint search. The trick is the ‘TotalRowsExactMinimum’ property. It instructs the SharePoint search engine how accurate TotalRows should be: it tells SharePoint the minimum number of hits that must be included in the result. Its default value is 50, and it is used in conjunction with TotalRows. If you have more than 50 search results and you use the TotalRows property without modifying TotalRowsExactMinimum, it will stop displaying further pages in the search results even though there are more records. Once you set ‘TotalRowsExactMinimum’ higher than the total number of results you expect to return, the search engine will return the accurate total. However, the higher you set ‘TotalRowsExactMinimum’, the greater the negative impact on performance. Here is a very good blog that gives a detailed explanation of how ‘TotalRowsExactMinimum’ works.
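
To get a feel for the trade-off, here is a toy model in Python. It is emphatically not SharePoint's algorithm, which I have not seen: I simply assume an engine that security-trims candidates one by one until it has confirmed exact_minimum readable hits, and then extrapolates the rest from the pass rate seen so far. The function and parameter names are made up for the example.

```python
def estimate_total_rows(candidates, can_read, exact_minimum=50):
    """Toy model: return (reported_total, is_exact) for a list of candidate hits."""
    confirmed = 0  # hits the user is confirmed to be allowed to see
    checked = 0    # candidates that went through the (expensive) access check
    for doc in candidates:
        if confirmed >= exact_minimum:
            break  # enough exact hits confirmed, stop paying for checks
        checked += 1
        if can_read(doc):
            confirmed += 1
    remaining = len(candidates) - checked
    if remaining == 0:
        return confirmed, True  # every candidate was checked: exact count
    # Extrapolate the unchecked remainder from the observed pass rate.
    pass_rate = confirmed / checked if checked else 1.0
    return confirmed + round(remaining * pass_rate), False


if __name__ == "__main__":
    docs = list(range(200))

    def user_can_read(doc):
        return doc % 2 == 0  # pretend every second document is readable

    print(estimate_total_rows(docs, user_can_read, exact_minimum=50))
    print(estimate_total_rows(docs, user_can_read, exact_minimum=500))
```

In this model a larger minimum means more access checks, which mirrors the performance cost described above.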


Choose your Enterprise Search solution wisely

Posted by chrissyz on December 6, 2008

In the whole SharePoint technology family, Search is one of my favorites. :) Enterprise Search plays a more and more important role in organizations. In the world of business, Search isn’t just about looking for information; it is about finding the content, applying the knowledge you gather and using it to benefit the business. It is about real people needing the right tools to help them get their jobs done. When you look at it from this perspective, you will understand that the search experience you have on the Internet can’t really satisfy enterprise-level requirements.

The following are the key questions to consider before you implement a search solution:

  • What sort of search solution are you after: web, desktop or intranet?
  • Do you want a standalone search product or an enterprise portal?
  • What are your UI and result presentation requirements?
  • What are your enterprise content sources?
  • What are your security requirements?
  • Do you need people search or LOB search?

Let’s have a look at what Microsoft has offered lately:


  • Microsoft Search Server 2008 Express
  • Microsoft Search Server 2008
  • Microsoft Office SharePoint Server 2007

[Feature comparison table for the three products; the per-product check marks are not recoverable. Features compared:]

  • Search Center
  • No Preset Document Limits
  • Extensible Search Experience
  • Relevance Tuning
  • Continuous Propagation Indexing
  • Indexing Connectors
  • Federated Search Connectors
  • Security Trimmed Results
  • Unified Administration Dashboard
  • Query and Results Reporting
  • Streamlined Installation
  • High Availability and Load Balancing
  • People and Expertise Searching
  • Business Data Catalog
  • SharePoint Productivity Infrastructure

I would like to highlight some key features of MOSS 2007 Search that might make a difference:

  • Relevance

Bear in mind that Enterprise Search differs from Internet Search in link structure, cross-site hierarchy and security. The Enterprise Search algorithm is tuned for enterprise content, so it gets the most relevant results quickly. Meanwhile, the indexing engine used by all Microsoft applications comes from a common code base, which enables common functionality and extensibility between the applications that use it.


  • Integration

From an integration point of view, MOSS 2007 Search is exposed as an XML web service, which enables deep integration with Office applications. Users can surface enterprise-level search functionality from within their own Office applications, like Word, Excel and PowerPoint, through a custom task pane or the Document Information Panel (DIP).


  • Comprehensiveness

Even where users have unobstructed access to structured data, differences in search interfaces, syntax and query methods can cause difficulties both when searching for information and when interpreting results. MOSS 2007 provides a common Search framework regardless of the information source, and makes sure that the interface gives casual users easy access to complex data sets.


    • Effectively search unstructured data

Although people generally have access to unstructured data, the process of finding it is often inefficient, with files in multiple locations (for example, multiple file shares containing duplicate copies and different versions of documents). MOSS 2007 can search all repositories across your enterprise (Windows file shares, Exchange public folders, Lotus Notes databases, web content). It is also extensible to include all types of files and custom repositories.


    • Search structured data

Many organizations lock down much of their structured data for fear of unauthorized users seeing more than they should, with the result that users are deprived of information that could be useful to them. Using MOSS 2007 and the Business Data Catalog (BDC), it is possible to index the structured data stored in LOB systems such as CRM, ERP, SAP and other databases, and to expose that data within the search results. This blending of structured and unstructured data makes finding the right information easier and faster. With the BDC, you have generic, reusable components in the form of the Business Data Web Parts. These components use the metadata repository to go out to the various business applications across the organization, retrieve the needed data, and present it in a single place. This happens without the need to write code or compile binaries; all the information SharePoint needs to connect to the business application is stored in XML format in the metadata repository.


    • Knowledge Interchange (People Search)

Getting a job done involves working with the right people, so it is important to be able to find subject matter experts based on their knowledge and contacts. It is often difficult to find specific experts within the company, which wastes time and leads to duplicated effort. MOSS 2007 enables intranet users to easily search for people and areas of expertise within the enterprise. This ability to make connections is critical to improving staff effectiveness in a large enterprise environment. Opt for an integrated solution in which users can easily make good use of real-time communications to build relationships such as knowledge networks and project teams.

  • Security

Look for solutions that provide custom security trimming, as well as standard features to help protect corporate information from unauthorized access. Find out how granular the administrative controls are, and check for customizable interfaces, scalability and extensibility.

    • Only provide the results the user is allowed to see

Search in MOSS 2007 provides query-time security trimming and supports pluggable authentication for content in WSS/MOSS sites.
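
Conceptually, query-time security trimming is a filter applied to the hits before they reach the user. The Python sketch below shows only that idea under assumptions of my own: each hit carries a made-up allowed_groups list, and a user may see a hit when they belong to at least one of those groups. It is not how MOSS stores or evaluates permissions.

```python
def trim_results(results, user_groups):
    """Keep only the hits that at least one of the user's groups may read."""
    groups = set(user_groups)
    return [hit for hit in results if groups & set(hit["allowed_groups"])]


if __name__ == "__main__":
    # Made-up hits with hypothetical access lists.
    hits = [
        {"title": "Company handbook", "allowed_groups": ["Everyone"]},
        {"title": "Salary review", "allowed_groups": ["HR"]},
        {"title": "Project plan", "allowed_groups": ["Project Team", "HR"]},
    ]
    for hit in trim_results(hits, ["Everyone", "Project Team"]):
        print(hit["title"])
```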


    • Easy security management

Administrators can create user roles that determine the kind of information that can be viewed by users during a search. This access control can be broad or granular as defined by the corporation. All of these tasks are administered through the Central Administration and SharePoint Service Portal interfaces, making security administration more usable and efficient.


  • Extensibility and scalability

There are several ways to customize content in MOSS 2007. The Search Center interface can easily be branded to reflect the visual identity of your organization, and additional tabs can be put in place for the kinds of searches users run most often. These tabs keep a familiar look while filtering results and giving users quick access to a specific application, database or directory anywhere in your enterprise. By adding search-enabled web parts to the personalization templates provided by SharePoint, query results can be made relevant to each individual user. And the Search Admin API and Query API allow developers to build custom search applications that meet specific business requirements throughout the enterprise.

