I was recently working on a new installation for a client and convinced them to put it on a 64-bit machine. It was only a single-server install, but I still set up service accounts for all the bits (more specifically, I set up SPService for the app pool account and SPSearchReader for the content crawling account). After I got SharePoint installed and running, I went after the search to make sure it worked with all the necessary file types, PDF being a very important one.
In December of 2008, Adobe released their 64-bit iFilter 9, so I thought I'd give it a try. Adobe has an installation procedure, but for some reason I overlooked it on the first try.
After installing the iFilter, I added the PDF extension to the list of file types to be crawled. I also installed Adobe Reader on the machine so I could open the documents after I found them. I started a full crawl and ended up with about 3,000 items in the search index from my test set of data. "Good," I thought. But when I started performing some searches, I couldn't find any PDF documents. So, I checked the crawl logs. Here's where I found the message, "The filtering process could not process this item. This might be because you do not have the latest file filter for this type of item. Install the corresponding filter and retry your crawl." Well, I had just installed the iFilter, so what was up? It started smelling like a security problem.
One of the first security issues that sounded plausible was the security settings around the DCOM components. This is where your service account needs to be able to launch the IIS WAMREG admin Service. So, I launched dcomcnfg and edited the security to allow local launch and local activation for my two service accounts, SPService and SPSearchReader. I ultimately removed SPSearchReader, as it was not necessary.
After making these changes, retrying crawls multiple times, and even rebooting the server, I was still getting the same error message. What else could it be? I researched several possibilities that came to mind:
- User rights assignment for the SPService or SPSearchReader accounts?
- DCOM permissions for IIS WAMREG admin Service?
- DCOM permissions for the Adobe iFilter thunking component (but this is native 64-bit, right? No more thunking.).
- Other Adobe DCOM object that needs permission updates?
- Directory permissions for the directory where the iFilter is installed? It already had read/execute for authenticated users.
- Adding the iFilter directory to the PATH environment variable?
I added the directory path where the iFilter is installed to the PATH environment variable. In this case, I added C:\Program Files\Adobe\AdobePDF9iFilter64\bin. I even rebooted the server. Same result.
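A quick way to confirm that the directory really ended up on the PATH is a small check like the one below. This is only a sketch: the helper name `path_contains` is my own, and it inspects the PATH of the process that runs it, which is not necessarily the environment the search service sees until that service (or the server) has been restarted.

```python
import ntpath
import os

# Directory quoted in the post; adjust if the iFilter was installed elsewhere.
IFILTER_BIN = r"C:\Program Files\Adobe\AdobePDF9iFilter64\bin"


def path_contains(path_value, directory):
    """Return True if `directory` is one of the entries in a Windows PATH string."""

    def norm(entry):
        # Drop surrounding quotes and trailing slashes, and ignore case,
        # since Windows treats these variants as the same directory.
        return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))

    wanted = norm(directory)
    entries = [e for e in path_value.split(";") if e.strip()]
    return any(norm(e) == wanted for e in entries)


if __name__ == "__main__":
    # Checks the PATH of the current process only (hypothetical usage).
    print(path_contains(os.environ.get("PATH", ""), IFILTER_BIN))
```

As the rest of the post shows, having the directory on the PATH was not enough on its own to get PDFs indexed.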
I resorted to making SPService a member of the Administrators group. Of course it worked after doing that. That's the answer to every security problem, right? I took SPSearchReader out of the Administrators group and it continued to index PDFs. So, I took SPService out of the Administrators group and, lo and behold, it continued to work!
One of the steps of the Adobe installation procedure (4.b.i) says to change the value of the key \\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf to the GUID {E8978DA6-047F-4E3D-9C78-CDBE46041603}. The original value was {4C904448-74A9-11D0-AF6E-00C04FD8DC02}. This appears to be a very important step! I tested the crawl with both the new GUID and the original one. My tests showed that, if you don't update the GUID, the iFilter won't be able to index the PDF content. Perhaps this was my problem all along! I'm not completely sure, and the only way I can make it fail now is by changing this registry key back to its original value.
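To see which filter is currently registered for .pdf without opening regedit, something like the following read-only sketch may help. The key path and both GUIDs come from the post; the function names are my own, and I am assuming the GUID is stored in the key's default value (possibly as a multi-string), so verify that against your own registry before relying on it.

```python
import sys

# Values quoted in the post (MOSS 2007 with the 64-bit Adobe PDF iFilter 9).
FILTER_KEY = (r"SOFTWARE\Microsoft\Office Server\12.0\Search\Setup"
              r"\ContentIndexCommon\Filters\Extension\.pdf")
ADOBE_IFILTER_GUID = "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"
ORIGINAL_GUID = "{4C904448-74A9-11D0-AF6E-00C04FD8DC02}"


def needs_update(current_value):
    """Return True if the registered filter is not the Adobe iFilter GUID."""
    # Assumption: the value may come back as a string or a list of strings.
    if isinstance(current_value, (list, tuple)):
        current_value = current_value[0] if current_value else ""
    return (current_value or "").strip().upper() != ADOBE_IFILTER_GUID.upper()


def read_registered_guid():
    """Read the default value of the .pdf filter key (Windows only, read-only)."""
    import winreg  # imported here so the module still loads on other platforms

    # Ask for the 64-bit registry view even from a 32-bit Python.
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, FILTER_KEY, 0, access) as key:
        value, _value_type = winreg.QueryValueEx(key, "")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    current = read_registered_guid()
    print("Registered filter:", current)
    print("Needs the Adobe GUID:", needs_update(current))
```

The sketch deliberately does not write to the registry. If it reports that the original GUID is still in place, make the change by following the Adobe procedure and then run a full crawl.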
After the PDF iFilter was working, I also installed the Microsoft Filter Pack and registered it with MOSS, according to Microsoft Knowledge Base article 946336. This is where I found there is a problem with the 64-bit Visio iFilter. Supposedly there is, or will be, a hotfix, but I couldn't find one.
8 comments:
Hi there, I had the same issue with the GUID and the PDF content not being indexed. After changing the GUID it works like a charm. Thanks for the hint. Obviously I overlooked that part of the installation procedure and was wondering why the content wasn't indexed.
Regards,
Tim
Does the iFilter only need to be installed on the Index server (we have 2 WFEs and a dedicated Index server)? They don't really go into multi-server environments in the install.
The iFilter only has to be installed on the Index server. The PDF icon has to be in the 12 hive of every WFE, and docicon.xml has to be changed on every WFE as well.
The Visio iFilter fix for X64 is KB960502. http://support.microsoft.com/kb/960502
You are the man!!! This worked for me. I had been googling for 3 days to find out why my PDF search didn't work.
Thanks
Regards
Thanks for this! I couldn't get this little devil to full text search until changing the GUID in the registry. As for your question regarding what truly fixed it, the only steps I performed were:
1. Add bin directory to PATH
2. Change GUID in registry
3. Full crawl
4. Sigh and finally put this issue to bed.
Question:
Windows Server 2003 R2 (std x64)
SharePoint Server 2007 Std w/SP1
incl/SQL Server 2005 (32 bit)
Can I use the 64 bit iFilter or do I need to use the 32 bit?
I haven't had any luck w/ indexing PDFs yet. Thx
This is exactly what I needed. Saved me a lot of headache trying to figure it out. A note, this is not limited to MOSS - I did the same steps on WSS 3 and had to change the GUID as directed here. The keys within the registry are in a slightly different hierarchy, but the keys themselves are exactly the same as described here.
So if you are having trouble with WSS 3, try the GUID fix.