SSEP - BUT Cannot save its Page Data

rshweb Posts: 11

Every time I try to index the site it comes back with this message for some of the pages:

BUT Cannot save its Page Data
It indexed 323 pages but says it cannot index 123 pages
Any ideas?

MarPlo Posts: 186
I think those pages contain some characters that cannot be saved in the database.
Can you post the URLs of some of those pages?

rshweb Posts: 11
Yes here are a few:

rshweb.com/website-maintenance
rshweb.com/web-site-hosting
rshweb.com/cpanel-beginners-guide

MarPlo Posts: 186
I tested your website on localhost (with XAMPP) with the SSEP crawl and index application.
As you can see in the image below, almost all the pages are indexed, except 3 pages that returned "Cannot get Page data - Status: 404".
The application works. I don't know what the problem is; maybe it is something on your server.

- Try to reindex your website (check the Reindex button).
- If you still get those errors, I can give you the database with all the pages I indexed, so you can import and use it on your server.

[Image: screen_sh-rshweb_com]

rshweb Posts: 11
Yes, I just tried to re-index and got the same result:
"Without indexed content - 123"

I even tried deleting and cleaning, then reindexing, and got the same result.
I have one of the tech guys looking into it.
Any suggestions what he could look for?
And if you could send me a copy, that could work too.
Good script, thanks!!

Edit:
Nothing is working, no matter what we do on our end.
Can we enable some debugging information about why it is returning these unexpected results?

MarPlo Posts: 186
To test some of those addresses manually, try "Manual Indexing".
Add some of the URLs that gave the error to the textarea (one per line).
Also, you can use a sitemap in XML format. SSEP can generate a sitemap with the indexed URLs.

I sent you an email (to the address you registered here) with two attachments: a sitemap of your site (sitemap.xml) and an SQL file with the ssep tables containing the indexed pages of your website (rshweb.sql).
First, manually delete all the ssep_ tables from your database, then import the rshweb.sql file.

rshweb Posts: 11
Well, no matter what I do I get the same results.
New database, same results.
Even tried setting it up on a different site, same results.
Tried from my sitemap and yours, same results.
Tried importing your SQL file; it seemed like it did not find it or see it.
Is there any way to enable logging?
Or any other ideas?

MarPlo Posts: 186
That error appears when the data is not inserted in the MySQL table, but the database returns no error.

The SSEP script has no logging system.
Try checking the error logs on your hosting server.

I don't have other ideas. I tested the script on two servers: one on localhost (with XAMPP), and one on the server where coursesweb.net is hosted. In both cases your website was successfully indexed.

If you want to see whether it is something on your server, try the script on localhost (on your PC) with a server application like XAMPP.

I don't know how to help you in this case, because I don't get that error when indexing a website.

The SQL file I sent you was exported with phpMyAdmin; I don't know why it was not found or seen.
You have to extract it from the zip archive, then import the SQL file from cPanel or with phpMyAdmin.

rshweb Posts: 11
Well, you're not going to believe this one.
It's some of my "meta_description" values being too long.
I shortened a few and they now index.
Is there a way to change the script so as not to have this problem?

Edit:
I did find where you had the description limit set to 150.
Changed it to 190, which seemed to help a lot.
Also went through all the descriptions, making sure they are all under 190.
But I still have 9 pages not being indexed.
I just can't see a problem with these.
This is one that is not being indexed:

rshweb.com/blog-secure-wordpress-hosting
Registered - Depth: 2 - Time: 0.014312 - Size: 31.71 KB; BUT Cannot save its Page Data
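
Since over-long meta descriptions turned out to be the cause for most pages, one way to find the remaining offenders is to measure each page's description before indexing. The following is only a minimal sketch in Python, independent of SSEP (which is written in PHP); the 190 limit comes from this thread, and the class and function names are hypothetical helpers, not part of the script.

```python
from html.parser import HTMLParser

# Limit mentioned in this thread; adjust it to match your own SSEP settings.
MAX_DESCRIPTION = 190


class MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Only look at <meta> tags, and stop after the first description found.
        if tag != "meta" or self.description is not None:
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "description":
            self.description = attr_map.get("content", "")


def description_length(html):
    """Return (description, length in characters); description is None if absent."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    desc = parser.description
    return desc, (len(desc) if desc is not None else 0)


def is_too_long(html, limit=MAX_DESCRIPTION):
    """True when the page's meta description exceeds the character limit."""
    _, length = description_length(html)
    return length > limit
```

Feeding the saved HTML of each unindexed page to `is_too_long` would show whether its description is still over the limit. It counts characters, not bytes, so a description with many non-ASCII characters may still take more storage than the count suggests.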

MarPlo Posts: 186
I made some changes in the SSEP script:
1. The maximum number of characters for the Title in the database is 120.
2. The maximum number of characters for the Description in the database is 190.
3. Special characters in the Title and Description (like @`,."\|?<>) are removed before adding them to the database.
4. If the number of characters in the Title or Description exceeds the maximum allowed size, only the allowed number of characters is extracted and inserted in the database.
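
The clean-up described in points 1 to 4 can be sketched roughly as follows. This is an illustration in Python, not the actual SSEP code (which is PHP); the exact set of removed characters is an assumption taken from the examples listed in point 3, and `clean_field` is a hypothetical name.

```python
MAX_TITLE = 120
MAX_DESCRIPTION = 190

# Assumed set: the characters given as examples in the post: @ ` , . " \ | ? < >
SPECIAL_CHARS = '@`,."\\|?<>'


def clean_field(text, max_chars):
    """Remove the listed special characters, then keep at most max_chars characters."""
    table = str.maketrans("", "", SPECIAL_CHARS)
    cleaned = text.translate(table).strip()
    # Slicing counts characters (code points), not bytes.
    return cleaned[:max_chars]
```

For example, `clean_field(title, MAX_TITLE)` and `clean_field(description, MAX_DESCRIPTION)` would be applied just before the insert. Truncating in the application avoids relying on the database to reject or silently shorten an over-long value.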

Download again and reinstall the script from: coursesweb.net/php-mysql/ssep-site-search-engine-php-ajax_s2

rshweb Posts: 11
Yes, that helped greatly.
It is not indexing content from 7 pages now.
Here are 2 of the pages:

rshweb.com/wordpress-favicon
rshweb.com/blog-configserver-security-firewall
Let me know if you see anything.
Again, THANKS!!
Richard

MarPlo Posts: 186
In my tests those pages are indexed. I'm not sure what the problem is; maybe it is the way your database is set up regarding the character set and collation (utf-8, ASCII, general_ci, or another type).
I'm not sure that is the cause, but in the "wordpress-favicon" page I saw less usual double quotes (inside the parentheses); try removing them:

content="A favicon (short for “favorite icon”) is an small image generally intended to be used when you bookmark a web page"
- But in my database that page was indexed.
So I guess your database doesn't allow certain characters; but maybe I'm wrong, because those characters are already in the database for the page content.
Anyway, it's good that 99.5% has been resolved.
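
To check the remaining pages for unusual characters such as the curly quotes mentioned above, a small scan of the description text can help. This is a minimal Python sketch with a hypothetical function name, not part of SSEP; it only reports what it finds and does not prove that the database rejects those characters.

```python
import unicodedata


def unusual_characters(text):
    """List each distinct non-ASCII character as
    (char, code point, Unicode name, number of UTF-8 bytes)."""
    found = []
    seen = set()
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.add(ch)
            found.append((
                ch,
                f"U+{ord(ch):04X}",
                unicodedata.name(ch, "UNKNOWN"),
                len(ch.encode("utf-8")),
            ))
    return found
```

The byte count matters because MySQL's older `utf8` character set (utf8mb3) stores at most 3 bytes per character, while 4-byte characters need `utf8mb4`. The curly quotes take 3 bytes each in UTF-8, so they would fit in a utf8 column but not in a plain ASCII one.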