Search is Everywhere! What you need to know about Search in SharePoint 2013 Preview
Posted Tuesday, July 17, 2012 9:37 PM by CoreyRoth
As I start writing this post, I know it is going to be one of those posts that covers so much that it will be hard to cover it all. The point of this post is to give you a high-level idea of everything new and changed in Search with SharePoint 2013 Preview. Search is everywhere in SharePoint now, so it is important to understand how it has changed. For example, take a look at this document library utilizing the power of Search.
Aside from cool stuff like drag and drop, document libraries directly leverage the search engine to allow users to filter documents easily. Gone are the days of relying on CAML for simple document library searches. You’ll find out later in this post why you can rely on search for those uses as well.
In SharePoint 2013 Preview, Search was essentially rewritten from the ground up. You’ll see familiar concepts like managed properties as well as how FAST Search for SharePoint morphed into this new product. The concept of FAST Search for SharePoint servers is gone. The components from FS4SP have made their way directly into SharePoint and we don’t need separate dedicated servers for it (necessarily). If you’re familiar with FAST ESP, you will see some familiar components from there too. Don’t worry, we aren’t starting over from scratch, but you will see some exciting new things that make search such a powerful feature of SharePoint.
Today’s post will provide a high-level overview of many of the new concepts in search. It will serve as a springboard for a series of detailed articles about the individual components of Search that I will post in the coming weeks. We’ll cover Search in the following areas: topology, crawling, querying, user interface, API, and SharePoint Online.
Topology
The underlying search topology has changed quite a bit. However, most of it is based upon concepts you may have seen before from FAST Search for SharePoint. Components can be scaled out to multiple servers as needed. These changes can be done through the Search Service Application or through PowerShell. When making changes to the topology, you don’t change the active topology directly. Instead, you clone it, make your changes to the clone, and then make the clone the active topology. This section is pretty technical so feel free to skip it if you are only interested in the cool stuff like querying and the user interface. :)
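To make the clone-then-activate idea concrete, here is a minimal Python sketch. It is only a toy model: the `Topology` and `SearchService` classes, the component names, and the server names are all made up for illustration and are not the SharePoint API (the real work happens in the Search Service Application or PowerShell).

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Topology:
    # Maps a component name to the servers hosting it (all names hypothetical).
    components: dict = field(default_factory=dict)


class SearchService:
    """Toy stand-in for a Search Service Application; not the real API."""

    def __init__(self, topology):
        self.active = topology

    def clone_active(self):
        # Work on a copy so the live topology keeps serving requests.
        return deepcopy(self.active)

    def activate(self, topology):
        # Swap the edited clone in as the new active topology.
        self.active = topology


service = SearchService(Topology({"Crawl": ["app1"], "Index": ["app1"]}))
original = service.active

draft = service.clone_active()
draft.components["Index"].append("app2")  # scale the index out to a second server

# The active topology is untouched until the clone is activated.
assert service.active.components["Index"] == ["app1"]
service.activate(draft)
```

The point of the pattern is that a half-finished edit never affects the topology that is currently serving queries.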
Components
The first thing to cover is how the topology changed. Many components got new (but similar) names and they correspond to FAST Search for SharePoint components. The Search components are hosted on your SharePoint application servers using a Search Service Application. Components can be scaled to multiple servers for performance and redundancy. The components that make up search are:
Crawl Component
The crawl component crawls the actual data from a variety of sources such as SharePoint, File Shares, User Profiles, and Databases using BCS.
Content Processing Component
This component processes crawled items and feeds them to the index component. This is where document parsing occurs and where IFilters live. A generic IFilter will cover most of your needs. It also is responsible for language detection and entity extraction (both of which are features from FS4SP). It also produces the phonetic name variations for people search.
Index Component
If you are familiar with FAST Search for SharePoint, the index component will look pretty similar. The Index Component is used in both feeding and query processes. It takes items from the content processing component and writes them to the index. It also receives queries from the query processing component and returns result sets. The Index architecture is based off the rows and columns concepts in FS4SP. Index Replicas (rows) provide a level of redundancy with groups of servers. Index Partitions (columns) allow you to split the index between servers.
Analytics Component
The Analytics Component analyzes crawled items and how users interact with Search Results. It truly is a part of what makes search “learn” and provide better search results to the user.
Query Processing Component
This component performs linguistic processing at query time such as word breaking, stemming, spell checking, and the thesaurus. When the query comes in, it completes its processing and passes the query to the index component.
Search Administration Component
The administration component stores the various information about search that you configure through the user interface in the Search Service Application. It also manages topology changes.
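The rows and columns idea from the Index Component description above can be sketched in a few lines of Python. This is a rough illustration only: the post doesn't say how SharePoint decides which partition an item goes to, so the hash-based `partition_for` function is purely an assumption, and the server names are hypothetical.

```python
import zlib


def partition_for(doc_id: str, partition_count: int) -> int:
    # Assumed scheme: a stable hash so an item always maps to the same partition.
    return zlib.crc32(doc_id.encode("utf-8")) % partition_count


def build_layout(servers, partition_count, replica_count):
    """Return a grid where layout[partition][replica] is a server name."""
    needed = partition_count * replica_count
    if len(servers) < needed:
        raise ValueError(f"need {needed} servers, got {len(servers)}")
    pool = iter(servers)
    return [[next(pool) for _ in range(replica_count)]
            for _ in range(partition_count)]


# Two partitions (columns) split the index; two replicas (rows) add redundancy.
layout = build_layout(["s1", "s2", "s3", "s4"], partition_count=2, replica_count=2)

# Every replica of the chosen partition holds a copy of this item.
holders = layout[partition_for("doc-42", 2)]
```

Adding partitions spreads the index across more servers, while adding replicas gives each partition another copy to fall back on.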
Search Processes
The search processes look a bit different than what you may be used to. The Host Controller is a Windows service that manages various processes called NodeRunners. When you first install SharePoint 2013 Preview and you wonder where all your memory is, you’ll see multiple NodeRunner.exe processes at the top of the list. Each NodeRunner.exe hosts one of the various components above. Looking at the task manager, it is not obvious which process is running which component. There is a PowerShell script that will tell you, and I will post it in the future. Lastly, MSSearch.exe is the Windows Service that hosts the crawl component.
Crawling
The configuration of crawling looks similar but there are lots of changes. Many of the configuration changes you make aren’t limited to the Search Service Application any more. Changes can be made at the site collection and some can even be made at the site.
Content Sources
Content Sources are configured in much the same way, but there are some changes. The newest feature here is the concept of Continuous Crawling. This crawls your content source continuously (every 15 minutes by default). However, there is some magic that occurs now and new items can appear in the index within seconds. This is something users have always wanted and I am really excited about it. It also means that when the full crawl is executing, you can see changes to the index while it is still running.
Result Sources
Result Sources effectively combine Scopes and Federated Locations into one interface. However, they added a ton of new features in how you can build the queries that make up the result source. In a new instance, quite a few result sources are available out-of-the-box such as Local SharePoint Results, Popular, and Items Matching a Content Type. Here is what it looks like.
Whereas SharePoint 2010 only had protocols for Local Search and OpenSearch 1.1, SharePoint 2013 Preview (as well as SharePoint Online Preview) adds support for Remote SharePoint servers and Exchange. Since SharePoint and People Search results are served by the same search index now, you can choose which type of results you want here too.
Scrolling down the page, you have the ability to use the new Query Builder to construct a query. This new interface provides a lot of ways to create custom queries very easily.
The query builder lets you quickly construct a query even with dynamic values.
After you construct your query, you can click the TEST tab to see if the query works. There is so much to cover with the Query Builder that it will get its own post in the near future.
Document Parsing
A number of improvements have been made to document parsing as well. New high performance IFilters exist for common Office document formats as well as images and PDF. This means you won’t have to manually configure that IFilter any more. However, the existing IFilter interface is available still in case there is anything you want to add.
Entity Extraction
In FS4SP, entity extraction was managed using a set of XML files. In SharePoint 2013 Preview, this has been moved to the term store. The out-of-the-box entity extraction will automatically extract company names out of documents. You can use the term store to manage exclusions and inclusions. Unlike FS4SP, it doesn’t look like you are able to add your own term sets for entity extraction, so this is unfortunate.
Schema Management
Managed Properties and Crawled Properties are now referred to as the Search Schema. Managed Properties have a number of new parameters such as allowing for sorting and refining that we gained from FS4SP.
Site Collection administrators now also have the ability to make changes to the search schema at their level. This allows you to delegate some of the search configuration to admins and let them override settings without affecting things globally.
Export and Import
Another one of my favorite features in Search is that we finally have the ability to export and import search configurations. I’ve only been asking for this since 2007. :) You can export your search configuration as XML and then import it later. This can even be done at the site collection level.
Crawl Log Permissions
There are times where you may want to grant access to the crawl logs to non-administrator users. The new setting allows you to grant other users access.
Querying
We’ll now look at some of the improvements when querying search.
Query Spelling Correction
Customizations to Spelling Correction are now managed through the term store as well. This allows you to customize the “Did you mean?” functionality.
Query Rules
The new query rules engine lets you tailor your query results in ways never before possible. Each rule is composed of conditions and actions. When the conditions are met, one or more actions are implemented. Actions include things like promoting a result (similar to a best bet) and injecting a result block into the search results (basically changing the way search results look). I think a screenshot of the out-of-the-box query rules actually explains it better than I do.
In this example, if it finds a person name in the Local SharePoint Results, it will promote a result block showing people that matched the result. You can create your own query rules to really customize how search results look. You can even put start and end dates on a query rule.
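If it helps to think about query rules in code, here is a small Python sketch of the conditions-and-actions idea, including the start and end dates. Everything in it is hypothetical: the `QueryRule` class, the `KNOWN_PEOPLE` set, and the action strings are just a toy model, not how SharePoint implements or exposes query rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional


@dataclass
class QueryRule:
    name: str
    condition: Callable[[str], bool]   # decides whether the rule matches a query
    actions: List[str]                 # what happens when it matches
    start: Optional[date] = None       # optional publishing window
    end: Optional[date] = None

    def is_active(self, today: date) -> bool:
        if self.start is not None and today < self.start:
            return False
        if self.end is not None and today > self.end:
            return False
        return True


def apply_rules(query: str, rules: List[QueryRule], today: date) -> List[str]:
    """Collect the actions of every active rule whose condition is met."""
    fired = []
    for rule in rules:
        if rule.is_active(today) and rule.condition(query):
            fired.extend(rule.actions)
    return fired


KNOWN_PEOPLE = {"corey roth"}  # hypothetical stand-in for people search data

rules = [
    QueryRule("People name match",
              lambda q: q.lower() in KNOWN_PEOPLE,
              ["add result block: People"]),
    QueryRule("Holiday promo",
              lambda q: "holiday" in q.lower(),
              ["promote result: Holiday schedule"],
              start=date(2012, 12, 1), end=date(2012, 12, 31)),
]

actions = apply_rules("Corey Roth", rules, date(2012, 7, 17))
```

The second rule only fires for queries issued in December 2012, which mirrors putting start and end dates on a rule.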
User Interface
The User Interface has pretty much been rewritten in its entirety. It starts with the addition of the ResultScriptWebPart which retrieves and displays search results. This replaces the CoreResultsWebPart from SharePoint 2010 and has a ton of new functionality. Take a look at the search center in this example:
There is a lot to take in here. In my example here, I had a number of PowerPoint presentations returned. When I hover over one of them, it gives me a large visual preview of the slide deck that I can flip through. On top of that, it picked up the key sections of the document and listed them under “Take a look inside”. At the bottom of the preview there are useful links such as Follow (Social feature), Edit, View Library, and Send. What’s cool about the document preview is that it lets you scroll through the entire document. It even shows the animations in PowerPoint decks. If you used this functionality at all before with FS4SP, you might have been hit with the fact that document previews didn’t work against documents sitting on a claims-enabled web application. Note that previews only work with claims authentication now (along with many other things).
There are a few remaining features to point out. On the left, you see some of the new visual refiners that allow you to search by different modified dates. The search box at the top also provides options to easily jump between documents, people, conversations, videos, and reports.
Result Types
In the above screenshot, you might have noticed that the PowerPoint results are formatted a certain way. This is through the new Result Type feature that allows you to customize how a particular result looks based upon a condition. As someone that customizes search, if there is anything to get excited about, this is it. Result Types are comprised of Rules, Properties, and Display Templates. The Rules define when the result type should be used (i.e.: Excel Documents, People, or Picture Library). When one of these rules matches, that Result Type will be used to display the individual result in a unique way. Properties refer to managed properties and these are what you will use in your display template to show the data from the result. Here is what the Result Types page looks like in the Site Collection.
You can’t edit the built-in result types, but you can create new ones. Before you define your result type, it is a good idea to create a new display template as you will have to select it from a list when creating the result type. The Display Templates are .js files and are kept in the ~sitecollection/_catalogs/masterpage/Display Templates/Search folder.
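Here is a rough Python sketch of how a result type ties a rule, some managed properties, and a display template together. Treat it as a mental model only: the `ResultType` class, the property names, and the template file names are illustrative assumptions, and the real matching is done by SharePoint, not by code you write like this.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Folder named in the post; the file names below are hypothetical.
TEMPLATE_FOLDER = "~sitecollection/_catalogs/masterpage/Display Templates/Search"


@dataclass
class ResultType:
    name: str
    rule: Callable[[Dict[str, str]], bool]  # when this result type applies
    properties: List[str]                   # managed properties the template uses
    display_template: str                   # .js file that renders the result


def pick_result_type(item, result_types, default):
    # The first result type whose rule matches wins; otherwise use the default.
    for result_type in result_types:
        if result_type.rule(item):
            return result_type
    return default


default_type = ResultType("Default", lambda item: True,
                          ["Title", "Path"], "Item_Default.js")
result_types = [
    ResultType("PowerPoint",
               lambda item: item.get("FileExtension") in ("ppt", "pptx"),
               ["Title", "Path", "Author"], "Item_PowerPoint.js"),
    ResultType("Excel",
               lambda item: item.get("FileExtension") in ("xls", "xlsx"),
               ["Title", "Path"], "Item_Excel.js"),
]

chosen = pick_result_type({"FileExtension": "pptx"}, result_types, default_type)
template_path = f"{TEMPLATE_FOLDER}/{chosen.display_template}"
```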
I’ll post soon about how to configure result types, but take a look at another example. This time I have a mixture of different document types. Most result types look the same out of the box, but you can customize them heavily to meet your needs.
Search Refinement
The refiner web part has some added functionality. It has most of the refinement functionality included in SharePoint 2010. However, for those of you who used to use FAST ESP, we now have faceted navigation. This allows you to use refinement before you ever issue a search and is based upon data in the term store. Think of how BestBuy.com uses FAST ESP to allow you to select TVs –> >50” –> Plasma, etc. We can also leverage display templates to change how the refinement is rendered. This makes use of the new Refinable attribute that we see on managed properties. For example, that is how you see the date slider.
Query Suggestions
Query Suggestions have been improved largely through the use of the analytics component to analyze your personal search history. It actually weights results based on links you have previously visited. It also looks at the most frequent queries of all users to deliver better suggestions. There are two types of query suggestions: what you see before issuing the query and what you see along with the results. For the pre-query suggestions, you will get suggestions from your personal query log along with what other people have been searching for. For the suggestions after you get results back, it returns matches that you have clicked on at least twice. I’ll post more on this later after I have an environment that has been up long enough to capture some of this data.
API
The Search API has undergone a series of changes. Two new interfaces are available, while one was removed and another deprecated. If you write custom search code, you will want to pay attention to this section.
New Interfaces
The SharePoint REST API got some love in this release. In terms of search, we have a whole new interface for querying using REST. This is possible by calling the endpoint located at /_api/search. You can specify any site collection or site in the URL, but typically you’ll just go with the web application root URL. Specifying any other URL prefix will get you the same results as well. In one of my upcoming posts, we’ll go into some real examples of how to use this new endpoint.
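As a quick taste, here is a minimal Python sketch that only builds the URL for a REST search query. The post above just names /_api/search, so the `query?querytext='...'` part is my assumption about the shape of the call and should be checked against the official documentation. The host name is hypothetical, and the helper doesn't send a request or deal with authentication.

```python
from urllib.parse import quote


def build_search_url(site_url: str, query_text: str) -> str:
    """Build a search REST URL under /_api/search (query shape is an assumption)."""
    # Assumed OData-style escaping: a single quote inside the literal is doubled.
    escaped = query_text.replace("'", "''")
    return f"{site_url.rstrip('/')}/_api/search/query?querytext='{quote(escaped)}'"


# Hypothetical web application root URL.
url = build_search_url("http://intranet.example.com/", "sharepoint search")
```

Since any site or site collection prefix should return the same results, the `site_url` argument can be whatever URL you already have on hand.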
Search also got some love in the Client OM. That means you can now execute search queries using CSOM via JavaScript or .NET. According to the MSDN post, you can do mobile development as well, but I’m unsure if that includes Windows Phone. I am thinking it doesn’t but you can still use the REST API there. I’ll confirm that as I begin writing my posts on the API.
Removed and Deprecated
The Search API has been expanded greatly and you have some great new options to use. However, there are a few other changes you need to know about. First and foremost, the SQL Syntax has been removed. I’ve been telling you for years to stop using it. At SPC09, they said it was unofficially deprecated and now I can say straight up that it is gone. You also need to know that the Search web service (search.asmx) is now deprecated (but not gone) as well. That means you need to stop using the web service as it won’t work some day. If you need to remotely access Search, then use the Query CSOM or the Query REST Service, which are much better and more fully featured.
New Operators
If you have been using FAST Search for SharePoint, these three new operators will be nothing new to you. However, if you were strictly running SharePoint 2010, they may be of interest. Previously the XRANK operator was only available in FQL. Now we can use it in regular keyword queries and it gives us the ability to dynamically adjust the rank of items. The NEAR operator has been improved to include a configurable token distance (besides the default of 8) and a new ONEAR operator allows for ordered near functionality. Most of these operators are pretty hardcore so most people probably will never use them but they are there if you need them. You can also continue to use FQL if you prefer.
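To give a feel for these operators, here is a small Python sketch that just assembles keyword query strings. The `NEAR(n=...)`, `ONEAR(n=...)`, and `XRANK(cb=...)` forms are written the way I understand the keyword syntax, so treat them as assumptions and verify them against the syntax reference. The helper names `near` and `xrank` are my own.

```python
from typing import Optional


def near(left: str, right: str, distance: Optional[int] = None,
         ordered: bool = False) -> str:
    """Build a NEAR or ONEAR expression; omit distance to use the default."""
    operator = "ONEAR" if ordered else "NEAR"
    if distance is not None:
        operator += f"(n={distance})"
    return f'"{left}" {operator} "{right}"'


def xrank(match_expr: str, rank_expr: str, boost: int) -> str:
    """Boost the rank of matched items that also satisfy rank_expr."""
    return f"({match_expr}) XRANK(cb={boost}) ({rank_expr})"


default_near = near("search", "topology")
ordered_near = near("search", "topology", distance=4, ordered=True)
boosted = xrank("sharepoint", "title:search", 100)
```

With these inputs the helpers produce `"search" NEAR "topology"`, `"search" ONEAR(n=4) "topology"`, and `(sharepoint) XRANK(cb=100) (title:search)`.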
SharePoint Online
As Microsoft works to bring feature parity to SharePoint Online, the preview brings us a heap of new features in Search. Whereas you could configure next to nothing in the previous iteration of SharePoint Online, you can do just about anything with Search now. This comes from the new Search link inside tenant administration. From the list you see below, you can do just about anything except configure Content Sources. These are still handled automatically by SharePoint Online so we can’t change the frequency of crawls nor can we crawl other sources such as HTTP or BCS. Take a look at the Search settings in Tenant administration to get a feel for what you can do.
Scrolling down, we can adjust a few more settings.
I’ve already talked about Export and Import. The new Search Center Settings link allows you to set a global search center that will be used on each site. The feature parity in Search with SharePoint Online Preview is impressive. In fact, all of my screenshots for this article came from the cloud.
Summary
As I expected, this post has proved to be quite long. I tried to be brief in each section so that I could cover as much as possible. Anything in bold in this article will likely be a follow-up blog post, so stay tuned. I expect to find information that needs to be updated or points that I left out. I’ll be posting updates to this post as necessary. Anyhow, I hope this post has proved useful in explaining what you need to know about Search in SharePoint 2013 Preview.
Follow me on twitter at @coreyroth.