I’ve seen a couple of these issues come in from our customers, so I wanted to get the word out.
When crawling PPTX files with embedded links in 2007, the following error appears in the crawl logs:
“The filtering process could not be initialized. Verify that the file extension is a known type and is correct”.
Before applying the fix, confirm that this is the issue you’re running into: review the crawl log for the error above and check that the affected documents actually contain embedded links.
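If there are many documents to check, opening each one by hand gets tedious. The following is a minimal sketch, not an official tool, that lists hyperlink relationships inside a PPTX file. It assumes that "embedded links" here means ordinary hyperlinks, which the Open XML package stores as relationships of type hyperlink in its .rels parts. The function name find_hyperlinks is hypothetical, and the script only inspects the file; it does not reproduce what the filter does during a crawl.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the <Relationship> elements in every .rels part of an Open XML package.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type used for hyperlinks (assumption: this is what "embedded links" refers to).
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def find_hyperlinks(pptx_path):
    """Return (rels_part_name, target) pairs for each hyperlink relationship in a PPTX."""
    links = []
    # A .pptx file is a ZIP archive; relationships live in parts ending in .rels.
    with zipfile.ZipFile(pptx_path) as package:
        for name in package.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(package.read(name))
            for rel in root.iter("{%s}Relationship" % REL_NS):
                if rel.get("Type") == HYPERLINK_TYPE:
                    links.append((name, rel.get("Target")))
    return links


if __name__ == "__main__":
    # Usage: python find_links.py deck1.pptx deck2.pptx
    for path in sys.argv[1:]:
        found = find_hyperlinks(path)
        print("%s: %d hyperlink(s)" % (path, len(found)))
        for part, target in found:
            print("  %s -> %s" % (part, target))
```

Run it against the documents named in the crawl log. A file that reports zero hyperlinks is probably failing for a different reason.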
The fix is to install the Microsoft Office 2010 Filter Pack on the 2007 server hosting the Index role: