AUTOMATED SITEMAP GENERATOR
Search Engine Optimisation
One of the most widely written-about ways of getting your website onto search
engines, improving its search engine ranking and getting your pages indexed
is to create a sitemap that search engines can read. These days it usually
takes the form of a small XML file containing the paths to your web pages,
together with a list of the pages to index. This is particularly true for the
Google search engine. Google and Yahoo have gone to the trouble of creating
sites where you can submit your sitemap, so you may as well take advantage of
them.
They are called Google Accounts and Site Explorer respectively.
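For illustration, a sitemap with a single entry might look like the sketch
below. The page name index.html and the lastmod value are made-up
placeholders, and the namespace is the one the script further down writes.

```xml
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
 <url>
  <loc>http://domain.e-pond.info/index.html</loc>
  <lastmod>2008-01-15T09:30:00+00:00</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
 </url>
</urlset>
```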
There are a number of online sitemap generators available. All you do is copy
and paste what these generators produce into an XML file and upload it to the
appropriate directory on your server.
This is all very well and good, but you can do without the chore of going to
these online generators, or modifying the XML file by hand, every time you
modify, update or create web pages on your site. After all, another search
engine optimisation technique is to update regularly!
You should be concentrating on the content-rich websites you're making, not
spending all your time on website housekeeping.
All you need is the ability to set up cron jobs on your server, the ability
to upload to the appropriate directory, a simple text editor like Notepad on
Windows and this PHP script. Set it up once, then forget it. It's that simple!
An example of the script to paste into your text editor, including advice on
what to modify to suit your website, is as follows:
<?php
// Build an XML sitemap from existing folders using this sitemap_genie from
// domain.e-pond.info.
// After modifying, upload the script to your website's root folder.
// Run the script from the command line -> php sitemap_genie.php
// or through cron jobs from the control panel.
// If you can't, you may try to run it from your browser.
// You may have to set permissions on the script and also on folders,
// depending upon how your site is designed.
// The script will output sitemap.xml files.
$base_path = "/home/yourserverloginusername/public_html/";
// Base path for your website; this is the server path.
// Change yourserverloginusername to the username you use to access your
// server. If you are creating a sitemap for what's in the root directory use
// public_html, otherwise swap public_html for the directory name of the
// subdomain or directory.
$website_url = "http://domain.e-pond.info";
// Swap our website URL for yours.
$change_frequency = "daily";
// How often are your web pages updated? Swap the word daily above with your
// preference.
// Choices to use are:
// always ; hourly ; daily ; weekly ; monthly ; yearly ; never
$priority = 0.5;
// This indicates the priority of your pages.
// 0.5 is the default; values range from 0.0 to 1.0.
// This script allows you to assign one priority value to all pages.
$gm_diff = "+00:00";
// The difference in hours from your timezone to Greenwich Mean Time.
$web_files = array("php"=>"1", "htm"=>"1", "html"=>"1");
// These are the file extensions which will be included in the sitemap.
// Modify as required.
$exclude = array("my_private_file.html"=>"1", "my_private_folder"=>"1");
// These are the folders or files to be excluded. my_private_folder shows the
// style for directories; my_private_file.html is an example of excluding a
// web page from the list. Modify to suit.
$page_name = "sitemap_genie.php";
// Name of this file. If you change the filename, please update this to
// reflect the new name.
// end setup.
// Leave the script below alone.
print "start building an xml sitemap with path = $base_path - website url = $website_url\n";
$folder_count = 0;
$wu_count = 0;
$sitemap_size = 0;
$sitemap_wu_count = 0;
$sitemap_count = 1;

// Open a sitemap file for writing and emit the XML header.
// The first file is sitemap.xml; later ones are named sitemap2.xml,
// sitemap3.xml and so on (an assumed naming scheme) so that a rollover does
// not overwrite the previous file.
function open_map ($open_path, $open_count) {
    $suffix = ($open_count > 1) ? $open_count : "";
    $write_handle = fopen($open_path."/sitemap".$suffix.".xml", "w") or die();
    fputs($write_handle, "<?xml version='1.0' encoding='UTF-8'?>\n");
    fputs($write_handle, "<urlset xmlns=\"http://www.google.com/schemas/sitemap/0.84\">\n");
    return $write_handle;
}
$write_handle = open_map($base_path, $sitemap_count);
// Walk the folder tree breadth-first, starting from the base path.
$walk_folders = array("/");
while ( $walk_folders ) {
    $shift_folder = array_shift($walk_folders) or die();
    $actual_folder = $base_path.$shift_folder;
    print "actual working folder = $actual_folder<p>\n\n";
    if ( is_dir($actual_folder) ) {
        if ( $dh = opendir($actual_folder) ) {
            // Compare against false so a file named "0" does not end the loop.
            while ( ($file = readdir($dh)) !== false ) {
                print "file -> $file<br>\n";
                if ( ($file != ".") && ($file != "..") && ($file != $page_name)
                    && (! isset($exclude[$file]))
                    && (file_exists($actual_folder.$file)) ) {
                    if ( is_dir($actual_folder.$file) ) {
                        print "file $file is directory, include to walk through folders<br>\n";
                        $folder_count++;
                        $walk_folders[] = $shift_folder.$file."/";
                    }
                    else {
                        // Use the extension after the last dot. Files without an
                        // extension yield "" and are skipped instead of stopping
                        // the script.
                        $wfm = strtolower(pathinfo($file, PATHINFO_EXTENSION));
                        if ( isset( $web_files[$wfm] ) ) {
                            $last_mod = date("Y-m-d H:i:s", filemtime($actual_folder.$file));
                            $lm_array = explode(" ", $last_mod);
                            $last_mod = $lm_array[0]."T".$lm_array[1].$gm_diff;
                            $put_url = $website_url.$shift_folder.$file;
                            // Escape the five XML special characters ("&" first).
                            $put_url = preg_replace('#&#', '&amp;', $put_url);
                            $put_url = preg_replace('#\'#', '&apos;', $put_url);
                            $put_url = preg_replace('#"#', '&quot;', $put_url);
                            $put_url = preg_replace('#>#', '&gt;', $put_url);
                            $put_url = preg_replace('#<#', '&lt;', $put_url);
                            // print web file to sitemap.
                            $wf_put = " <url>\n";
                            $wf_put .= "  <loc>".$put_url."</loc>\n";
                            $wf_put .= "  <lastmod>".$last_mod."</lastmod>\n";
                            $wf_put .= "  <changefreq>".$change_frequency."</changefreq>\n";
                            $wf_put .= "  <priority>".$priority."</priority>\n";
                            $wf_put .= " </url>\n";
                            print "put url $put_url to sitemap number $sitemap_count<br>\n";
                            $sitemap_size += strlen($wf_put);
                            $sitemap_wu_count++;
                            $wu_count++;
                            if ( ($sitemap_wu_count < 45001 ) && ($sitemap_size < 9000000 ) ) {
                                fputs($write_handle, $wf_put) or die();
                            }
                            else {
                                // A limit was reached: close this file and start the next.
                                fputs($write_handle, "</urlset>") or die();
                                fclose($write_handle) or die ();
                                $sitemap_count++;
                                $write_handle = open_map($base_path, $sitemap_count);
                                // Write the URL that triggered the rollover to the
                                // new file so it is not lost.
                                fputs($write_handle, $wf_put) or die();
                                $sitemap_wu_count = 1;
                                $sitemap_size = strlen($wf_put);
                            }
                        }
                    }
                }
            }
            closedir($dh);
        }
    }
}
fputs($write_handle, "</urlset>") or die();
fclose($write_handle) or die();
print "end building sitemap -><br>\ntravelled $folder_count folder/s -><br>\ngot $wu_count URL's in the sitemap/s\n";
exit;
?>
Save the file as sitemap_genie.php, or under the name you set in the script
above.
Upload it to your website. For example, create the directory sitemapgenie and
save this PHP file into it.
To set up a cron job in the control panel, use the following example for the
command:
/usr/local/bin/php -f /home/yourserverloginusername/public_html/sitemapgenie/sitemap_genie.php
Then just set the frequency for running the cron job and it's ready to go.
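Where a control panel asks for a full crontab line rather than separate
schedule fields, the command above might be combined with a schedule as in
the sketch below. The 3:00 a.m. daily schedule is only an assumption, chosen
to match the daily change frequency set in the script, and the path reuses
the placeholder username from the script's $base_path.

```
# minute hour day-of-month month day-of-week command
0 3 * * * /usr/local/bin/php -f /home/yourserverloginusername/public_html/sitemapgenie/sitemap_genie.php
```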