How to: Configure Enterprise Search to index a file share
Posted Wednesday, July 1, 2009 9:53 AM by CoreyRoth
I am sticking with my series of introductory Enterprise Search topics today by writing up some details on how to index a file share. Setting up a file share index is pretty simple, but there are a few things to know, so that is the point of today’s post.
The first step of indexing a file share is identifying your crawl account. This is the account that will be used to index the file share (unless you specify a different one with a crawl rule), so it needs read access to the file share. Start by granting this account read access to every folder, subfolder, and file that you want indexed. Any folder this account doesn’t have access to will be excluded. If you are not too familiar with how permissions work on file shares, there are two places where an account must have permission: the Sharing tab and the Security tab. You use the Security tab to grant an account access on the file system itself; these are the same permissions that apply when that user is logged on to the machine directly and viewing the files. The Sharing tab controls what permissions the user has when accessing the folder over the network. For an account to read files over the network, it must have read permission on both tabs. Here is an example of what mine looks like for my crawl account MOSS_Setup. Note: the screenshot is from Windows Server 2008. Previous versions looked a bit different.
Security Tab with read access:
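Beyond checking the two tabs by eye, it can help to confirm that the crawl account really can read everything before you start a crawl. Here is a minimal sketch in Python, assuming you run it while logged on as the crawl account and point it at the share path. The function name find_unreadable is made up for this example and is not part of SharePoint. It only reports what the current account fails to open; it does not tell you whether the Sharing tab or the Security tab is to blame.

```python
import os


def find_unreadable(root):
    """Return a sorted list of paths under root that the current account cannot read."""
    unreadable = []

    def on_error(err):
        # os.walk calls this when it cannot list a directory.
        unreadable.append(err.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Opening the file and reading one byte is a direct test of read access.
                with open(path, "rb") as handle:
                    handle.read(1)
            except OSError:
                unreadable.append(path)
    return sorted(unreadable)
```

Calling find_unreadable(r"\\server\share") from a Windows machine would walk the share over the network, so both sets of permissions come into play. An empty list means the account could open every file it found.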
After you have configured permissions for your account, go to the SSP –> Search Administration –> Content Sources. Create a new content source and give it a name; I called mine File Share. Then specify a start address. You can write the path as file://server/share or \\server\share. Enter the path to one or more file shares and then save the content source. You can also specify here whether or not to index subfolders. This is what my file share looks like.
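The two start address forms mentioned above are interchangeable, so if you keep a list of shares in a script or spreadsheet you may want them all in one form. This is a small sketch that converts the UNC form to the file:// form. The helper name to_file_url is my own invention, and it does no escaping of spaces or special characters, which is an assumption you should check against your own paths.

```python
def to_file_url(path):
    r"""Convert a UNC path such as \\server\share into file://server/share."""
    # Addresses already in file:// form are returned unchanged.
    if path.lower().startswith("file://"):
        return path
    if not path.startswith("\\\\"):
        raise ValueError(r"expected a UNC path such as \\server\share")
    # Split on backslashes and drop empty pieces (e.g. from a trailing backslash).
    parts = [piece for piece in path[2:].split("\\") if piece]
    if len(parts) < 2:
        raise ValueError("a UNC path needs both a server and a share")
    return "file://" + "/".join(parts)
```

For example, to_file_url(r"\\server\share") gives file://server/share, the same address shown above.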
One thing to note before crawling is that it will only index file types that you have allowed on the File Types page. For example, PDF is not included by default. Add any extensions that you might need, specifying each extension without the period (e.g., pdf, not .pdf). You can also add file types programmatically. Adding the extension alone is enough to get files of that type indexed, but if you want the contents of each file indexed, you will also need to install an appropriate IFilter for any new file type.
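If you are collecting a list of extensions to add, the leading period is an easy thing to get wrong. Here is a rough sketch that cleans up a list before you type it into the File Types page. The function normalize_extensions is hypothetical and does not talk to SharePoint at all; lowercasing the extensions is my own choice for consistency, not a documented requirement.

```python
def normalize_extensions(raw):
    """Strip whitespace and leading periods, lowercase, and drop blanks and duplicates."""
    cleaned = []
    for item in raw:
        ext = item.strip().lstrip(".").lower()
        # Keep the first occurrence of each extension, preserving input order.
        if ext and ext not in cleaned:
            cleaned.append(ext)
    return cleaned
```

Passing in [".pdf", "PDF", " docx "] would give back ["pdf", "docx"], which is the form the File Types page expects.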
Once your file types are in order, you are ready to begin a full crawl. After the crawl is completed, view the Crawl Log and verify that your files were indexed. If there was a permissions problem or any other issue accessing the file share, you will see it here. At this point you can go to your search center and try a search. If all goes well, you should see some search results. To see what got indexed, you can easily write a keyword query to show everything in the content source. For example:
ContentSource:"File Share"
The results would look something like this.
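If you build such queries in code rather than typing them into the search box, the only detail to watch is the straight double quotes around a content source name that contains a space. A minimal sketch, with a made-up helper name, that just assembles the query text:

```python
def content_source_query(name):
    """Build a keyword query that restricts results to one content source."""
    if '"' in name:
        # Quoting rules for embedded quotes are not covered here, so refuse them.
        raise ValueError("content source name must not contain a double quote")
    return 'ContentSource:"{}"'.format(name)
```

Calling content_source_query("File Share") produces the same query text shown above.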
As you can see it’s pretty simple to index file shares. For more information on querying by content source, check out this post.