» Surpass Web Hosting Forums » Discussions » All Things Techy » Cpanel and search indexer

Old January 25th, 2008, 3:03 PM   #1 (permalink)
Registered User
Fresh Surpasser
 
Joined in Mar 2007
21 posts
Gave thanks: 3
Thanked 0 times
Cpanel and search indexer

Hi all,

I am using Joomla and have a file repository (doc, pdf, ...), and I want to let my visitors search the file contents.

Is cPanel compatible with Swish-e (a search indexer), or is there another solution that I can implement for that?

Thanks a lot for your help.
mirco is offline  
Old January 25th, 2008, 4:41 PM   #2 (permalink)
Yabadabadoo
Super #1
 
Geoff's Avatar
 
Joined in Nov 2004
Lives in B.C., Canada
Hosted on Dedicated
1,011 posts
Gave thanks: 7
Thanked 28 times
Hm, you want it to search cPanel files/directories? I don't think that is recommended, or even useful in any way.
__________________
Geoff Ellis - Surpass Dedicated Server Customer
www.adepttechs.net
Geoff is offline  
Old January 27th, 2008, 1:23 PM   #3 (permalink)
Registered User
Fresh Surpasser
 
Joined in Mar 2007
21 posts
Gave thanks: 3
Thanked 0 times
Hi Geoff,

thank you for the reply.

I need to offer a search facility on my website that searches file contents.

My website contains a document repository (pdf, ppt, doc, xls, ...). When visitors launch a search with keywords, they get a result only if their keywords appear in the file title or in the file description (both stored in MySQL tables). In the current situation, if the keywords exist in the file content but not in the title or the description, the search returns nothing.
You may agree that this is not the best way.
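
To make the gap concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database standing in for the MySQL tables, and the table name, column names, and sample rows are all hypothetical. It compares a search over title and description only with one that also looks at a stored copy of the extracted file text.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the site's MySQL tables.
# The table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (title TEXT, description TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("Annual report", "Figures for 2007",
         "Revenue grew thanks to the new hosting plans."),
        ("Hosting guide", "How to pick a plan",
         "Compare shared and dedicated servers."),
    ],
)


def search_metadata(keyword):
    """Match the keyword against title and description only."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT title FROM files "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return [title for (title,) in rows]


def search_with_content(keyword):
    """Also match the keyword against the stored file text."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT title FROM files "
        "WHERE title LIKE ? OR description LIKE ? OR content LIKE ? "
        "ORDER BY title",
        (pattern, pattern, pattern),
    )
    return [title for (title,) in rows]
```

With this sample data, `search_metadata("revenue")` finds nothing, while `search_with_content("revenue")` finds the annual report, because the word appears only in the file text. A plain `LIKE` scan is fine for a sketch but would likely be slow on a large repository.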

In my recent research I have found 2 solutions:
  1. converting all my files (pdf, ppt, doc, xls, ...) to XML format and loading them into the MySQL database so that they become searchable.
  2. or installing a search indexer (such as Microsoft's index, called a catalog, or Swish-e, or Glimpse) to index the content of all files stored in a specified directory, not everything on my server, so that a search returns a result whenever the keyword is indexed.
As I read recently, Glimpse can be added as a cPanel add-on, and Swish-e uses a Perl API called from PHP, so I may be able to add it as a Perl module. Both should normally be feasible!
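
The idea behind the second solution can be sketched with a tiny inverted index in Python. This is only an illustration of the general technique, not how Swish-e or Glimpse are actually implemented, and it assumes the file contents have already been extracted to plain text. The file names and texts are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical documents, already extracted to plain text.
documents = {
    "report.pdf": "Quarterly revenue and hosting costs",
    "slides.ppt": "Hosting roadmap for next year",
}
index = build_index(documents)
```

Here `search(index, "hosting revenue")` returns only `report.pdf`, since that is the one document containing both words. The index is built once, so each search only looks up words instead of rereading every file.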

I would like feedback from Surmunity members: has anyone implemented one of these solutions, how did you implement it, or is there a better way to do what I want?

Thanks a lot for any help.

Last edited by mirco; January 27th, 2008 at 1:31 PM. Reason: correction
mirco is offline  
Old January 27th, 2008, 1:25 PM   #4 (permalink)
Skittles
Super #1
 
DewKnight's Avatar
 
Joined in Aug 2004
Lives in Space ship
Hosted on dedi
6,480 posts
Gave thanks: 91
Thanked 176 times
I've used this on a site: http://www.xav.com/scripts/search/

Not sure if it's what you're really looking for, but I know you can set it up to read PDF files (with a plugin). I think you can set it up to read Office files as well.
__________________
Mountain Dew Knight
People should not be afraid of their governments. Governments should be afraid of their people.
DewKnight is offline  
Old January 28th, 2008, 4:21 AM   #5 (permalink)
Yabadabadoo
Super #1
 
Geoff's Avatar
 
Joined in Nov 2004
Lives in B.C., Canada
Hosted on Dedicated
1,011 posts
Gave thanks: 7
Thanked 28 times
Okay, I don't know anything about Swish-e.

But there is a feature in cPanel that, if enabled, allows the use of some free CGI scripts, one of which is called Entropy Search. If it's enabled, you should be able to see it listed under

http://yourdomain:2082/frontend/x3/cgi/index.html (provided you are using cPanel 11 with the x3 theme), or simply look for "CGI Center" in your cPanel account.

Anyway, that might do what you need. I don't know how well it works with Joomla repositories and pdf etc. files.
__________________
Geoff Ellis - Surpass Dedicated Server Customer
www.adepttechs.net
Geoff is offline  
Old January 28th, 2008, 12:08 PM   #6 (permalink)
Registered User
Fresh Surpasser
 
Joined in Mar 2007
21 posts
Gave thanks: 3
Thanked 0 times
Convert (doc, pdf, ppt, xls, zip, ...) to HTML?

Hi guys,

Thanks a lot for your responses, they are very interesting.

I have taken a look at both solutions. http://www.xav.com/scripts/search/ seems to search only plain text, HTML, or PDF (it needs more configuration to install, and the support is minimal).

Entropy Search searches only HTML, Perl, and plain text files (but it seems simpler to install on cPanel).

For both solutions, I need another tool to convert my (doc, pdf, xls, zip, ...) files to an HTML version. That will allow search indexing and improve SEO for the whole website, because search engines can index the content of the HTML version more effectively than just the description or a zip file.
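
As a rough sketch of the last step of such a conversion, the Python function below wraps text that has already been extracted from a document in a minimal HTML page, so that an HTML-only indexer could read it. Extracting the text itself (for example with a command-line tool such as pdftotext for PDF files) is outside this sketch, and the function name and page layout are hypothetical.

```python
import html


def text_to_html(title, text):
    """Wrap already-extracted plain text in a minimal HTML page.

    Blank lines separate paragraphs; special characters are escaped
    so the text cannot break the markup.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )
```

Each source file would get one such page, named after the original, so that a search hit on the HTML version can link back to the real document.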

So I think it is more appropriate to choose Entropy Search, because it is easier to implement and I will be able to get support from Surpass through cPanel.

Does anyone recommend a tool to convert (doc, pdf, ppt, xls, zip, ...) files to HTML?

I really appreciate your help.
mirco is offline  