Welcome Visitor, Please Login or Register Now Thursday, April 25, 2024 07:22 PM 
Forums Index > Mewsoft > Official Announcements > RSS Box Released Free Download. RSS Feed Display Boxes On Your Site
New Topic   New Reply
Search for:
Author Message
mewsoft mewsoft's personal page
Administrator Team Member
Posts: 5381 Display member's posts
Joined: 04-25-24 07:22 PM
Location: knxoville, TN, USA
Member Offline
View Member's Profile Visit member's website Mewsoft Corporation Send private message ICQ Messenger: ICQ AIM Messenger: AIM YIM Messenger: Yahoo MSN Messenger: MSN
 
Back to top
RSS Box Released Free Download. RSS Feed Display Boxes On Your Site Edit Delete Reply with quote Quote
First Post Posted on: 11-19-12 04:02 PM next post first post

Dear Everyone,

Today we released our RSS Feeds parser and display script. It is GPL totally 100% free to download and use. Please download it from our download page or from the link below:

http://www.mewsoft.com/download/rssbox.zip

The program is  single perl script cgi file and does not needs any other modules or files, it is standalone RSS parser, web agent and display results with different themes.

Please download the file rssbox.zip, unzip it, you will find 2 files rssbox.cgi and a sample examples file rssbox.html.

The program usage is very simple, upload the file rssbox.cgi to your cgi-bin folder or any other folder you want, change its permissions to 755 or executables, make sure it is working by pointing your browser to it like this:

http://www.mewsoft.com/cgi-bin/rssbox.cgi

You should get the default Mewsoft Forums News since no target rss link supplied.

Now to display a specific new or rss feed on your pages, all you need is to call it like this:

http://www.mewsoft.com/cgi-bin/rssbox.cgi?http://rss.cnn.com/rss/cnn_tech.rss

You can customize the layout of each output box.
You can have any number of boxes on the same page and each box has its own customization and setting.

Open the file rssbox.cgi with any text editor like notepad yu will find how you can customize the boxes layout and set different themes. You display each box with a different theme by specifiy the them=number in the url , for example:

http://www.mewsoft.com/cgi-bin/rssbox.cgi?http://rss.cnn.com/rss/cnn_tech.rss&theme=4

The above link will display the feed using the theme settings number four in the file rssbox.cgi which you can edit to your needs, you can have unlimited number of theme where for each theme you can clip title, descriptions, limit the number of items to display, you can strip html tags from the description,

The themes array in the file looks like this:

my @Theme = (
        { #theme=0
         limit       => 0,
         max_title     => 0,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },

        { #theme=1
         limit       => 10,
         max_title     => 50,
         title_more    => '...',
         strip_html    => 1,
         max_description  => 200,
         description_more => '...'
        },
        { #theme=2
         limit       => 5,
         max_title     => 0,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=3
         limit       => 0,
         max_title     => 0,
         title_more    => '...',
         strip_html    => 1,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=4
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=5
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=6
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=7
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=8
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=9
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
        { #theme=10
         limit       => 0,
         max_title     => 70,
         title_more    => '...',
         strip_html    => 0,
         max_description  => 0,
         description_more => '...'
        },
    wink;

You can also include any of these rss fields in your display:

# < !--Link-->    : the link to the rss article or item
# < !--Title-->    : the rss item title
# < !--Description--> : the full item description
# < !--pubDate-->  : the item publication date
# < !--Author-->   : the author of the item if exists
# < !--ID-->     : unique id number generated for each item


 

Mewsoft RSS Box Samples Demos

 

Yahoo Tech News
BBC News
http://mewsoft.com/cgi-bin/rssbox.cgi?http://rss.news.yahoo.com/rss/tech&theme=3 http://mewsoft.com/cgi-bin/rssbox.cgi?http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml&theme=3
CNN World News
NY Times News
http://mewsoft.com/cgi-bin/rssbox.cgi?http://rss.cnn.com/rss/cnn_world.rss&theme=2 http://mewsoft.com/cgi-bin/rssbox.cgi?http://www.nytimes.com/services/xml/rss/nyt/Business.xml&theme=5
Mewsoft News
Yahoo Tech News
http://mewsoft.com/cgi-bin/rssbox.cgi?http://www.mewsoft.com/cgi-bin/forum/forum.cgi?action=RSS&Forum=9,15,13,23,1,6&Limit=10
http://mewsoft.com/cgi-bin/rssbox.cgi?http://rss.news.yahoo.com/rss/tech&theme=3


Mewsoft Support
www.mewsoft.com

mewsoft mewsoft's personal page
Administrator Team Member
Posts: 5381 Display member's posts
Joined: 04-25-24 07:22 PM
Location: knxoville, TN, USA
Member Offline
View Member's Profile Visit member's website Mewsoft Corporation Send private message ICQ Messenger: ICQ AIM Messenger: AIM YIM Messenger: Yahoo MSN Messenger: MSN
 
Back to top
Re: RSS Box Released Free Download. RSS Feed Display Boxes On Your Site Edit Delete Reply with quote Quote
Reply #: 1 Posted on: 11-19-12 04:07 PM last post previous post


rssbox.cgi Perl source code


Code: Perl    Select All    Expand All
#!/usr/bin/perl
#!C:\perl\bin\perl.exe
=Copyright Infomation
==========================================================
    Program Name    : Mewsoft RSS Box
    Program Author   : Dr. Elsheshtawy, Ahmed Amin, Ph.D. Physics
    Home Page          : http://www.mewsoft.com
   Email                    : support@mewsoft.com
   Products               : Auction, Classifieds, Directory, Forums, etc. Integrated
   Copyrights © 2008 Mewsoft® Corporation. All rights reserved.
==========================================================
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if notwrite to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
==========================================================
=cut
#==========================================================
$| = 1;
print "Content-type: text/html\n\n";
use CGI::Carp qw(fatalsToBrowser)
my $VERSION = "0.50";
#==========================================================
#==========================================================

#==========================================================
#         default rss feed link
#==========================================================
#the $default_rss link is used to grab the rss contents if no one is specified
#in the url query, for example:
#   http://mewsoft.com/cgi-bin/rssbox.cgi?http://www.nytimes.com/services/xml/rss/nyt/Business.xml
$default_rss = 'http://www.mewsoft.com/cgi-bin/forum/forum.cgi?action=RSS&Forum=9,15,13,23,1,6&Limit=25';
#==========================================================
#            Theme customization settings
#==========================================================
# The array below @Theme contains different configuration theme settngs,
#using a theme number starting from 0, 1, 2, etc, you can select the settings
# to display your feeds, you can edit the array values and add more theme
# as you like, to call a specific theme with specific rss link, see this eample:

#   http://mewsoft.com/cgi-bin/rssbox.cgi?http://rss.news.yahoo.com/rss/tech&theme=1
# the above example uses the settings in the theme array entry number 1.
# to use the default theme, theme=0, no need to specify it.

# Theme settings details
#limit:
#set to 0 to display all defult feeds, anything else to limit the number of items.
#max_title:
#maximum length of the title, 0: is the full length
#title_more:
#set to anything if you want to show the title is clipped.
#strip_html:
#set 1 if you want to strip all html code from the description.
#max_description: 
#set to zero "0" to show the full description, or set to the max.
#allowed, if this variable set, the variable $strip_html will be also automaticaly set.

#description_more: set to anything if you want to show the description is clipped.

my @Theme = (
                        { #theme=0
                           limit                     => 0,
                           max_title               => 0,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },

                        { #theme=1
                           limit                     => 10,
                           max_title               => 50,
                           title_more            => '...',
                           strip_html            => 1,
                           max_description      => 200,
                           description_more   => '...'
                        },
                        { #theme=2
                           limit                     => 5,
                           max_title               => 0,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=3
                           limit                     => 0,
                           max_title               => 0,
                           title_more            => '...',
                           strip_html            => 1,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=4
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=5
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=6
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=7
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=8
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=9
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
                        { #theme=10
                           limit                     => 0,
                           max_title               => 70,
                           title_more            => '...',
                           strip_html            => 0,
                           max_description      => 0,
                           description_more   => '...'
                        },
            );
#==========================================================
my $content;
my $row;
#==========================================================
#The $content variable holds the html template code for output, you can 
#customize the html code as you like to your own needs.
#You must include the parsed rss output by inserting the code <!--Output--> 
#anywhere in the html code of the $content.
#you can customize everything below between these 2 lines:
#$content =<<content;
#   ....
#content;
#==========================================================
$content =<<content;

<script type="text/javascript">
function Toggle(id) {  
   
   HideOthers(id);
   var ctrlID = '';
   if (document.all) {ctrlID = document.all(id);}
   else   if (document.getElementById) {ctrlID = document.getElementById(id);}

   if (ctrlID.style.display == 'none'){ctrlID.style.display = 'block';   }
   else   {ctrlID.style.display = 'none';}
}

function HideAll(){
   var tags = document.getElementsByTagName("div")
   for(i=0;i<tags.length;i++){
      tags[i].style.display = 'none'
   }
}

function HideOthers(divID){
   var tags = document.getElementsByTagName("div")
   var re = new RegExp('^item','i'); //item1, item2, ..., item99999
   for(i=0;i<tags.length;i++){
      if (tags[i].getAttribute('id') != null){
         if (tags[i].getAttribute('id') != divID){
            if (re.test(tags[i].getAttribute('id'))) {tags[i].style.display = 'none';}
            //.getAttribute('id').match('item')
            //if(tags[i].getAttribute('id').indexOf('item') != -1){tags[i].style.display = 'none';}
         }
      }
   }
}
</script>

<table border="0"  width="100%" cellspacing="0" cellpadding="2" style="border-collapse: collapse">
   <tr>
      <td nowrap align="center" width="100%">
            <!--Output-->
      </td>
   </tr>
</table>
content

#==========================================================
# Below is the rss display table row, you can customize to your needs.
#These fields can be used inside your html code to display rss fields:
#   <!--Link-->            : the link to the rss article or item
#   <!--Title-->            : the rss item title
#   <!--Description-->   : the full item description
#   <!--pubDate-->      : the item publication date
#   <!--Author-->         : the author of the item if exists
#   <!--ID-->               : unique id number generated for each item
#you can customize everything below between these 2 lines:
#$content_row=<<content_row;
#   ....
#content_row;
#==========================================================
$content_row=<<content_row;
<tr><td width="100%">
   <a href="javascript:Toggle('item<!--ID-->');"><!--Title--></a><br />
   <div id="item<!--ID-->" name="item<!--ID-->" style="display: none; overflow:auto;background-color: #F4F4F4; color: #000000;padding: 0px 2px 2px 2px; margin: 0px 3px 3px 3px; border: 1px #C0C0C0 groove;">
      <div id="desc<!--ID-->" style="width:100%; height: 20px;background-color: #E1E1E1;">
         &nbsp;<a href="<!--Link-->" target="_blank">Read more...</a><br />
      </div>
      <!--Description-->
   </div>
</td></tr>
content_row
#==========================================================
#          No need to edit anything below unless you know what you are doing
#==========================================================
   my $buffer;
   if ($ENV{CONTENT_LENGTH} ) {
            read(STDIN, $buffer$ENV{CONTENT_LENGTH});
    } 
   elsif ($ENV{QUERY_STRING}) {
            $buffer = $ENV{QUERY_STRING};
    }
   
   my $rsslink;
   #($rsslink, undef) = split(/\&/, $buffer); 
   $rsslink = $buffer;
   my $theme = 0;
   if ($buffer =~ /&theme=(\d?)/i) {
         $theme = int($1if ($1);
         $theme += 0;
         if ($theme < 0) {$theme = 0;}
         $rsslink =~ s/&theme=(\d*)//;
   }
   $rsslink =~ tr/+/ /;
   $rsslink =~ s/%([A-Fa-f0-9]{2})/pack("C", hex($1))/ge;
   $rsslink =~ s/\r|\n|\cM$//g;
#==========================================================
   #open F, "forum.xml" || die("Failed to open input file feed_sample.xml\n");
   #my $rss = join "", <F>;
   #close F;
   #$rsslink = 'http://www.mewsoft.com/cgi-bin/forum/forum.cgi?action=RSS&Limit=20';
   #$rsslink = 'http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml';
   #$rsslink = 'http://www.nytimes.com/services/xml/rss/nyt/Business.xml';
   #$rsslink = 'http://rss.cnn.com/rss/cnn_world.rss';
   #$rsslink = 'http://rss.news.yahoo.com/rss/tech';
   $rsslink ||= $default_rss;
   #my ($tree, @items) = RssParser::parse_xml($rss);
   my ($tree@items) = RssParser::parse_url($rsslink);
#==========================================================
#==========================================================
   my $output = "";
   my $counter = 0;
   if (@items) {
      foreach my $item (@items) {
            $counter++;
            if ($Theme[$theme]->{limit} > 0 && $counter > $Theme[$theme]->{limit}) {last;}
            my $line = $content_row;
            $line =~ s/<!--Link-->/$item->{link}/g;
   
            my $title = $item->{title};
            if ($Theme[$theme]->{max_title} > 0 && length($title) > $Theme[$theme]->{max_title}) {
               $title = substr($title1,$Theme[$theme]->{max_title}) . $Theme[$theme]->{title_more} ;
            }
            $line =~ s/<!--Title-->/$title/g;
            #------------------------------------------
            my $description = $item->{description};
            if ($strip_html) {
                  $description =~ s/<[^>]+>//ig;
            }
            if ($Theme[$theme]->{max_description} > 0 && length($description) > $Theme[$theme]->{max_description}) {
                  $description =~ s/<[^>]+>//ig;
                  $description = substr($description1$Theme[$theme]->{max_description}) . $Theme[$theme]->{description_more};
            }
            $line =~ s/<!--Description-->/$description/g;
            
            $item->{pubdate} ||= "&nbsp;";
            $line =~ s/<!--pubDate-->/$item->{pubdate}/g;
            $item->{'dc:creator'} ||= $item->{author};
            $item->{'dc:creator'} ||= "&nbsp;";
            $line =~ s/<!--Author-->/$item->{'dc:creator'}/g;
            $line =~ s/<!--ID-->/$counter/g;
            $output .= $line;
      }
   }
   $content =~ s/<!--Output-->/$output/g;
   print $content;
   exit 0;
#==========================================================
#==========================================================
package RssParser;
#original code by Eli Billauer, rewritten by Mewsoft

BEGIN{

use strict;
use warnings;
use Exporter;

our @ISA = qw(Exporter);
our @EXPORT = qw (&parse_xml &parse_url &disptree);

our $VERSION = '0.5';

our %htmlescapes = (
          'quot' => 34,
          'amp' => 38,
          'apos' => 39,
          'lt' => 60,
          'gt' => 62,
          'nbsp' => 32# Was 160, but we make it a normal space
          'iexcl' => 161,
          'cent' => 162,
          'pound' => 163,
          'curren' => 164,
          'yen' => 165,
          'brvbar' => 166,
          'sect' => 167,
          'uml' => 168,
          'copy' => 169,
          'ordf' => 170,
          'laquo' => 171,
          'not' => 172,
          'shy' => 173,
          'reg' => 174,
          'macr' => 175,
          'deg' => 176,
          'plusmn' => 177,
          'sup2' => 178,
          'sup3' => 179,
          'acute' => 180,
          'micro' => 181,
          'para' => 182,
          'middot' => 183,
          'cedil' => 184,
          'sup1' => 185,
          'ordm' => 186,
          'raquo' => 187,
          'frac14' => 188,
          'frac12' => 189,
          'frac34' => 190,
          'iquest' => 191,
          'agrave' => 192,
          'aacute' => 193,
          'acirc' => 194,
          'atilde' => 195,
          'auml' => 196,
          'aring' => 197,
          'aelig' => 198,
          'ccedil' => 199,
          'egrave' => 200,
          'eacute' => 201,
          'ecirc' => 202,
          'euml' => 203,
          'igrave' => 204,
          'iacute' => 205,
          'icirc' => 206,
          'iuml' => 207,
          'eth' => 208,
          'ntilde' => 209,
          'ograve' => 210,
          'oacute' => 211,
          'ocirc' => 212,
          'otilde' => 213,
          'ouml' => 214,
          'times' => 215,
          'oslash' => 216,
          'ugrave' => 217,
          'uacute' => 218,
          'ucirc' => 219,
          'uuml' => 220,
          'yacute' => 221,
          'thorn' => 222,
          'szlig' => 223,
          'agrave' => 224,
          'aacute' => 225,
          'acirc' => 226,
          'atilde' => 227,
          'auml' => 228,
          'aring' => 229,
          'aelig' => 230,
          'ccedil' => 231,
          'egrave' => 232,
          'eacute' => 233,
          'ecirc' => 234,
          'euml' => 235,
          'igrave' => 236,
          'iacute' => 237,
          'icirc' => 238,
          'iuml' => 239,
          'eth' => 240,
          'ntilde' => 241,
          'ograve' => 242,
          'oacute' => 243,
          'ocirc' => 244,
          'otilde' => 245,
          'ouml' => 246,
          'divide' => 247,
          'oslash' => 248,
          'ugrave' => 249,
          'uacute' => 250,
          'ucirc' => 251,
          'uuml' => 252,
          'yacute' => 253,
          'thorn' => 254,
          'yuml' => 255
);

# These are typical HTML tags, which should be omitted.

our %ignore_tags = (
          'img'     => 1,
          'a'       => 1,
          'p'       => 1,
          'br'      => 1,
          'div'     => 1,
          'span'    => 1,
          'b'       => 1,
          'i'       => 1,
          'u'       => 1,
          'body'    => 1,
          'center'  => 1,
          'code'    => 1,
          'font'    => 1,
          'form'    => 1,
          'h1'      => 1,
          'h2'      => 1,
          'h3'      => 1,
          'h4'      => 1,
          'head'    => 1,
          'hr'      => 1,
          'html'    => 1,
          'li'      => 1,
          'ul'      => 1,
          'ol'      => 1,
          'pre'     => 1,
          'style'   => 1,
          'sub'     => 1,
          'sup'     => 1,
          'script'  => 1,
          'small'   => 1,
          'big'     => 1,
          'table'   => 1,
          'td'      => 1,
          'tr'      => 1,
          'th'      => 1,
          'textarea'=> 1,
          'strong'  => 1,
          'strike'  => 1,
          'blockquote' => 1,
         );

our %namespace = (
      # RSS 2.0 tags
      'xml'          => 'xml',
      'rss'          => 'rss',
      'rdf'          => 'rdf',
      'item'         => 'item',
      'channel'      => 'channel',
      'image'        => 'image',
      'title'        => 'title',
      'link'         => 'link',
      'description'  => 'description',
           'language'     => 'language',
      'copyright'    => 'copyright',
      'pubdate'      => 'pubdate',
      'lastbuilddate'=> 'lastbuilddate',
      'category'     => 'category',
      'generator'    => 'generator',
      'ttl'          => 'ttl',
      'url'          => 'url',
      'width'        => 'width',
      'height'       => 'height',
      'version'      => 'version',
      'encoding'     => 'encoding',
      'guid'         => 'guid',
      'enclosure'    => 'enclosure',
      #Mewsoft added
      'dc:creator'    => 'dc:creator',
      'author'         => 'author',

      # RSS 1.0 tags translated to RSS 2.0
      'subject'      => 'category',
      'rights'       => 'copyright',
      'modified'     => 'lastbuilddate',
      'date'         => 'pubdate',
      'resource'     => 'resource'# 1.0 specific!

      # Atom 1.0 tags translated to RSS 2.0
      'feed'      => 'channel',
      'summary'   => 'description',
      'content'   => 'description',
      'subtitle'  => 'description',
      'lang'      => 'language',
      'published' => 'pubdate',
      'updated'   => 'lastbuilddate',
      'logo'      => 'image',
      'entry'     => 'item',
      'href'      => 'link',
     );

Note that %specials refer to the *right* side of %ns, so only one
# entry is needed for each functional tag or its alias

# TRUE means array type
our %specials = (
       'item' => 1,
       'channel' => 1,
       'image' => 1,
       'xml' => 0,
       'rss' => 0,
       'rdf' => 0,
      );
}
#==========================================================

#==========================================================
sub parse_url {
my ($link) = @_;
   
   $link || return (undefundef);
   eval "use LWP::UserAgent;";
   my $ua = new LWP::UserAgent;
   $ua->agent('Mewsoft Agent www.mewsoft.com/3.0');
   $ua->timeout(180);
   my $request = new HTTP::Request(GET=> $link);
   my $response = $ua->request($request);
   if ($response->is_success) {
        return parse_xml($response->content);
   }
   else {
         return (undef$response->status_line); 
   }
   
}
#==========================================================
sub parse_xml {
my ($in$debug) = @_;

  $in =~ s/<!--.*?-->//gs# Remove comments
  my @segs = map { /^[ \n\r\t]*(.*?)[ \n\r\t]*$/s } ($in =~ /(<!\[CDATA\[.*?\]\]>|<[^>]+?>|[^<]+)/gs);

  # Strip off CDATAs. Added a prefix space to avoid accidental tag hits
  @segs = map { /^<!\[CDATA\[(.*?)\]\]>$/s ? " $1" : $_ } @segs;
  @segs = grep { length > 0 } @segs;
  
  my @stack = ();
  my @valstack = ();
  my %tree = ();
  my $here = \%tree;
  my @parent = ();
  my $lastval = "";
  
  foreach my $elem (@segs) {
      my ($modifier$tag$attr$empty) = ($elem =~ /^<([!?\#]{0,1})[ \n\r\t]*([^ \n\r\t]*[^ \/\n\r\t])[ \n\r\t]*(.*?)[ \n\r\t]*(\/{0,1})>$/s);

      $empty = 1 if ($modifier); 

      if (defined $tag) {
               $tag = lc $tag;      # We're case-insensitive

               Note that the regex below removes "dc:"-like namespace prefices
               my $closing;
               #($closing, $tag) = ($tag =~ /^(\/{0,1}).*?:{0,1}([^:]*)$/);
               ($closing$tag) = ($tag =~ /^(\/{0,1})(.*?)$/);

               if ($ignore_tags{$tag}) {
                  htmltags($here, unescape($elem));
                  next;
               }
            
               # Opening tags...
               unless ($closing) {
                  push @stack$tag;

                  my $alias = $namespace{$tag};

                  if (defined $alias) {
                     push @valstack$lastval;
                     $lastval = "";

                     if (defined $specials{$alias}) {
                        push @parent$here;
                        $here = {};       
                     }

                     Note that attributes may pollute the parent hash. This is
                     # necessary to support Atom 1.0

                     my @pairs = ($attr =~ /([^ \n\r\t]+?=\'[^\']*?\'|[^ \n\r\t]+?=\"[^\"]*?\"|[^ \n\r\t]+?=[^ \n\r\t]*)/g);

                     foreach my $p (@pairs) {
                           my ($k$v) = ($p =~ /(.+?)=(.*)/);
                           $k = lc $k;
                           $v = $1 if (($v =~ /^\'(.*)\'$/s) || ($v =~ /^\"(.*)\"$/s));    
                           ($k) = ($k =~ /([^:]*)$/); # Remove namespace prefix if present
                           my $alias1 = $namespace{$k};
                           if (defined $alias1) {
                              $here->{$alias1} = unescape($v);
                           }
                           else {
                              warn "Ignored attribute $k=$v\n" if $debug;
                           }

                     }#foreach my $p (@pairs) {
                  } #if (defined $alias) {
                  else {
                        warn "Ignored tag $tag\n" if $debug;
                  }#if (defined $alias) {
            }#unless ($closing) {
      
            # Closing tags, or close an empty opening tag
            if ($closing || $empty) {
                  my $p = pop @stack;
                  return "Bad XML tag nesting. Expected end tag for '$p', got '/$tag'" unless ($p eq $tag);
                  my $alias = $namespace{$tag};
                  if (defined $alias) {
                        my $thislastval = $lastval;
                        $lastval = pop @valstack;
                        if (defined $specials{$alias}) {
                                 my $parent = pop @parent;
                                 if ($specials{$alias}) { # Array type
                                       $parent->{$alias} = []  unless ((ref $parent->{$alias}) && (ref $parent->{$alias}) eq 'ARRAY');
                                       push @{$parent->{$alias}}, $here;
                                 } else {
                                       $parent->{$alias} = $here;
                                 }

                                 #$here->{'description'} =~ s/(<.*?>)/htmltags($here, $1)/ges if (defined $here->{'description'});
                                 $here = $parent;   
                        } #if (defined $specials{$alias}) {
                        else {
                                 $here->{$alias} = unescape($thislastvalunless ((length($thislastval) == 0) && 
                                                               (defined $here->{$alias}) && 
                                                               (length $here->{$alias}));       
                        }#if (defined $specials{$alias}) {
                  }#if (defined $alias) {
            }#if ($closing || $empty) {
      } #if (defined $tag) {
      else 
      {
            $lastval = (length $lastval) ? "$lastval $elem" : $elem;    
      }#if (defined $tag) {
   } #foreach my $elem (@segs) {

   return("Bad XML nesting: There were unclosed tags at EOF"if (@stack);
   #------------------------------------------------------
   #------------------------------------------------------
   my $filtertags = "item";
   my @items = items(\%tree$filtertags);
   return \%tree@items;
   #return wantarray ? @items : \%tree;
}
#==========================================================
sub items {
my ($what$filtertags) = @_;
my (@items);

   foreach my $k (sort keys %{$what}) {
         my $v = $what->{$k};
         if ((ref $veq 'HASH') {
            push @items, items($v$filtertags);
            next;
         }

         if ((ref $veq 'ARRAY') {
               my $count;
               for ($count=0; $count<=$#{$v}; $count++) {
                     if (lc($keq $filtertags) {
                           foreach my $i ($v->[$count]) {
                              push @items$i;
                           }
                     }
                     push @items, items($v->[$count], $filtertags);
               }
               next;
         }
   }
   return @items;
}
#==========================================================
sub htmltags {
my ($here$seg) = @_;

   my ($tag$attr) = ($seg =~ /^<[ \n\r\t]*([^ \n\r\t]+)[ \n\r\t]*(.*?)[ \n\r\t]*>$/s);

   return "" unless (defined $tag);

   $tag = lc $tag;

   # Respect HTML line breaks, even though the renderer won't
   return "\n" if (($tag eq 'p') || ($tag eq 'br'));

   if (($tag eq 'img') && !(defined $here->{'altimage'})) {
      my $new = {};
      $here->{'altimage'} = $new;
      $here = $new;
   } 
   elsif (($tag eq 'a') && !(defined $here->{'altlink'})) {
      my $new = {};
      $here->{'altlink'} = $new;
      $here = $new;
   } else {
      return "";
   }
      
   my @pairs = ($attr =~ /([^ \n\r\t]+?=\'[^\']*?\'|[^ \n\r\t]+?=\"[^\"]*?\"|[^ \n\r\t]+?=[^ \n\r\t]*)/g);
  
   foreach my $p (@pairs) {       
      my ($k$v) = ($p =~ /(.+?)=(.*)/);
      $k = lc $k;
      $v = $1 
      if (($v =~ /^\'(.*)\'$/s) || ($v =~ /^\"(.*)\"$/s));    
      $here->{$k} = $v;
   }

   return ""# This makes the function useful in substitutions
}
#==========================================================
sub disptree {
my ($what$s) = @_;

   foreach my $k (sort keys %{$what}) {
         my $v = $what->{$k};

         if ((ref $veq 'HASH') {
            print " "x$s."$k\n";
            disptree($v$s+2);
            next;
         }

         if ((ref $veq 'ARRAY') {
            my $count;
            for ($count=0; $count<=$#{$v}; $count++) {
                  print " "x$s.$k."[$count]\n";
                  disptree($v->[$count], $s+2);
            }
            next;
         }
         print " "x$s."$k => $v\n";    
   }
}
#==========================================================
sub single_unescape {
my ($ent) = @_;
   my $ord = $htmlescapes{lc($ent)};
   return chr($ordif defined $ord;
   return ""# Conversion failed, return nothing
}
#==========================================================
sub unescape {
Note! Unicode characters are escaped to space!
my ($x) = @_;

   # For now, we go wild, and convert all escape markers
   # Run twice, because of double-nested markups :-O
   for (my $i=0; $i<2$i++) {
      $x =~ s/&(\w+);/single_unescape($1)/ge;
      $x =~ s/&\#(\d+);/chr($1 < 256 ? $1 : 32)/ge;
      $x =~ s/&\#x([0-9a-fA-F]+);/chr(hex($1) < 256 ? hex($1) : 32)/ige;
   }
   return $x;
}
#==========================================================

1;



Mewsoft Support
www.mewsoft.com

Page 1 of 1
Go to page:

New Topic   New Reply Mark Unread
Jump to:  
Delete   Move     Lock   +Favorits   +Notify   Print