JCrawler Forum

Help forums for JCrawler

blank loc for some links

(13 posts)
  • Started 1 year ago by amackie
  • Latest reply from Rogier

  1. amackie
    Member

    My generated sitemap has empty loc tags in the middle of the document and some page links are missing from the links that are displayed. Otherwise it works fine - the links before and after the blank section are good.

    The blanks look like this:

    <url>
    <loc></loc>
    <lastmod>2009-02-17T10:15:50Z</lastmod>
    <priority>0.5</priority>
    <changefreq>daily</changefreq>
    </url>

    Any ideas? (And thanks for the great component.)

    Posted 1 year ago #
  2. mohanarun
    Member

    Hi, same problem here. Some URLs have a loc and some do not. During generation I also get the curl errors below, even though curl is installed and has worked properly until now.

    Curl error on url http://www.nimltdengineering.com/smoking-shelters/free-standing-smoking-shelter-1x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/free-standing-smoking-shelter-2x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/free-standing-smoking-shelter-3x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/free-standing-smoking-shelter-4x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/free-standing-smoking-shelter-9-x-25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/premium-smoking-shelter-1x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/premium-smoking-shelter-2x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/premium-smoking-shelter-3x25m.html: couldn't connect to host
    Curl error on url http://www.nimltdengineering.com/smoking-shelters/premium-smoking-shelter-4x25m.html: couldn't connect to host

    Posted 1 year ago #
  3. BHorst
    Member

    No errors on generation, but the same empty locs...

    <url>
    <loc></loc>
    <lastmod>2009-02-18T10:53:18Z</lastmod>
    <priority>0.5</priority>
    <changefreq>daily</changefreq>
    </url>

    Very good component!! Thanks a lot!

    Posted 1 year ago #
  4. BHorst
    Member

    I changed the file /administrator/components/com_jcrawler/admin.jcrawler.php so that it no longer writes empty locs:

    Find line 423: "foreach ($urls as $loc){"

    Delete the line with "$i++;"

    Add "if (strlen(trim($loc['url'])) > 0){ $i++;" after "$loc['url']=htmlspecialchars($loc['url']);"

    Add "}" after the line "</url>";"

    When you're finished it should look like this:

    foreach ($urls as $loc){

    /* utf-8 encoding */
    //$loc=htmlentities($loc,ENT_QUOTES,'UTF-8');
    //$loc=htmlspecialchars($loc,ENT_QUOTES,'UTF-8',false);
    $loc['url']=htmlspecialchars($loc['url']);

    if (strlen(trim($loc['url'])) > 0){ $i++;

    $modified_at = date('Y-m-d\Th:i:s\Z');
    $xml_string .= "
    <url>
    <loc>".$loc['url']."</loc>
    <lastmod>$modified_at</lastmod>
    <priority>".$priority."</priority>
    <changefreq>".$freq."</changefreq>
    </url>";
    }
    }

    Now there are no more empty locs.

    Posted 1 year ago #
  5. lumo
    Member

    Well done, BHorst, this fixed my blank loc problem.
    Thanks

    Posted 1 year ago #
  6. Jalby
    Member

    This definitely cleared up my missing locs, the sitemap validates and I saved 300 MB of error log! Thanks mate. Does this now constitute 1.7 Beta?

    Posted 1 year ago #
  7. patrick
    Key Master

    Thanks, yes, that's a fix!
    But I found the real problem that leads to the empty locs and have already fixed it!

    Greets, Patrick

    Posted 1 year ago #
  8. jerren
    Member

    What's the official problem/fix, Patrick? I'm seeing the same problem.

    Posted 1 year ago #
  9. FSGDAG
    Member

    I've made the changes above, but now I'm getting an error when trying to run jcrawler.

    Here is my code:

    _____________
    foreach ($urls as $loc){

    /* utf-8 encoding */
    //$loc=htmlentities($loc,ENT_QUOTES,\'UTF-8\');
    //$loc=htmlspecialchars($loc,ENT_QUOTES,\'UTF-8\',false);
    $loc[\'url\']=htmlspecialchars($loc[\'url\']);

    if (strlen(trim($loc[\'url\'])) > 0){ $i++;

    $modified_at = date(\'Y-m-d\\Th:i:s\\Z\');
    $xml_string .= \"
    <url>
    <loc>\".$loc[\'url\'].\"</loc>
    <lastmod>$modified_at</lastmod>
    <priority>\".$priority.\"</priority>
    <changefreq>\".$freq.\"</changefreq>
    </url>\";
    }
    }

    _________________

    And here are the error messages I'm getting when trying to run jcrawler:

    Warning: Unexpected character in input: '\' (ASCII=92) state=1 in /home/xxxx/public_html/administrator/components/com_jcrawler/admin.jcrawler.php on line 428

    Parse error: syntax error, unexpected T_STRING, expecting ']' in /home/xxxx/public_html/administrator/components/com_jcrawler/admin.jcrawler.php on line 527

    Posted 1 year ago #
  10. FSGDAG
    Member

    Forget my post above... :) I found what I was doing wrong and fixed it :) No more blank lines :)

    Posted 1 year ago #
  11. Rogier
    Member

    Tried to use that fix, but:

    Warning: Unexpected character in input: '\' (ASCII=92) state=1 in /home/********/domains/********/public_html/administrator/components/com_jcrawler/admin.jcrawler.php on line 429

    Parse error: syntax error, unexpected T_STRING, expecting ']' in /home/********/domains/********/public_html/administrator/components/com_jcrawler/admin.jcrawler.php on line 527

    Can somebody tell me what goes wrong on lines 429 and 527?

    Posted 1 year ago #
  12. Rogier
    Member

    This is the corrected code (line 423):

    foreach ($urls as $loc){

    /* utf-8 encoding */
    //$loc=htmlentities($loc,ENT_QUOTES,'UTF-8');
    //$loc=htmlspecialchars($loc,ENT_QUOTES,'UTF-8',false);
    $loc['url']=htmlspecialchars($loc['url']);

    if (strlen(trim($loc['url'])) > 0){ $i++;

    $modified_at = date('Y-m-d\Th:i:s\Z');
    $xml_string .= "
    <url>
    <loc>".$loc['url']."</loc>
    <lastmod>$modified_at</lastmod>
    <priority>".$priority."</priority>
    <changefreq>".$freq."</changefreq>
    </url>";
    }
    }

    Posted 1 year ago #
  13. Rogier
    Member

    Fixed Component:
    http://www.pixelschieber.ch/forum/topic/jcrawler-16b-beta#post-172

    Posted 1 year ago #
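
For anyone who wants to see the idea from this thread in isolation, here is a minimal PHP sketch of it: skip any entry whose URL is empty before writing the <url> block. It is not the JCrawler code itself. The function name build_url_entries and the sample URLs are made up for illustration, and the sketch assumes PHP 7 or later. Unlike the snippet quoted in the posts above, it uses gmdate() with a 24-hour "H" so that the timestamp agrees with the trailing "Z".

```php
<?php
// Hypothetical helper, not part of JCrawler: builds the <url> entries of a
// sitemap and skips every entry whose URL is empty after trimming.
function build_url_entries(array $urls, string $priority, string $freq): string
{
    $xml = '';
    // gmdate() returns UTC, which is what the literal "Z" suffix claims.
    $modifiedAt = gmdate('Y-m-d\TH:i:s\Z');

    foreach ($urls as $loc) {
        $url = isset($loc['url']) ? trim($loc['url']) : '';
        if ($url === '') {
            continue; // this is the case that produced <loc></loc>
        }
        $xml .= "<url>\n"
            . '<loc>' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . "</loc>\n"
            . "<lastmod>$modifiedAt</lastmod>\n"
            . "<priority>$priority</priority>\n"
            . "<changefreq>$freq</changefreq>\n"
            . "</url>\n";
    }
    return $xml;
}

// Sample data, made up for illustration: two usable URLs, two blank ones.
$urls = [
    ['url' => 'http://example.com/a.html?x=1&y=2'],
    ['url' => ''],
    ['url' => '   '],
    ['url' => 'http://example.com/b.html'],
];
$xml = build_url_entries($urls, '0.5', 'daily');
echo $xml;
```

If you copy code from a forum page and PHP then reports "Unexpected character in input: '\' (ASCII=92)", it is worth checking whether the quotes arrived with backslashes in front of them. The errors in posts 9 and 11 are consistent with that, although the thread does not confirm the cause.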
