
Re: parsing markup with Perl

  • From: Shlomi Fish <shlomif@shlomifish.org>
  • To: Michael Sokolov <msokolov@safaribooksonline.com>
  • Date: Mon, 10 Feb 2014 10:06:50 +0200

Hi Michael,

I'm joining this thread late.

First of all, I'd like to note that I found the original article (maybe it's
more of a blog post) on O'ReillyNet, which I only skimmed, to be mostly
factual and correct. I'm saying this as an active Perl programmer and
contributor to the Perl ecosystem (including the Perl core and
CPAN - https://metacpan.org/author/SHLOMIF ) who knows a little about the
proper way to do things and has been helping, or trying to help, people with
their Perl problems on various Perl-specific forums.

Otherwise, I believe your message was informative and insightful. Some nitpicks:

On Sun, 09 Feb 2014 07:54:24 -0500
Michael Sokolov <msokolov@safaribooksonline.com> wrote:

> On 2/8/14 4:08 PM, Arjun Ray wrote:
> > But why blame Perl, other than it's the language eveyone loves to
> > hate?  (Actually, I thought that would be C++.)
> >

There is the quote “There are only two kinds of languages: the ones people
[constantly] complain about and the ones nobody uses.” (from
https://en.wikiquote.org/wiki/Bjarne_Stroustrup ). Also see what I wrote about
it here - http://shlomif-tech.livejournal.com/57811.html (“Your Programming
Language Must [expletive deleted]”) and there’s also the Worse-is-better conjecture -
https://en.wikipedia.org/wiki/Worse_is_better .

> >
> I think it's because regexes (*real* perl regexes, not the pale 
> imitations purists refer to as "regular expressions") are so easy to 
> use, convenient, and powerful in Perl, unlike other languages (cough, 
> Java I'm looking at you).   

Well, the term "regular expression" has its origins in theoretical computer
science - see http://mathworld.wolfram.com/RegularExpression.html . Regular
expressions in that sense are limited in their scope and functionality and can
be translated to and from deterministic finite automata (DFAs or “state
machines”). However, after Ken Thompson adopted them as a practical feature in
ed, and they later appeared in tools like sed, grep, vi, etc., they were
extended with additional syntax and functionality. Moreover, the POSIX basic
regex syntax (e.g: plain “grep”) specifies such features as back references
(e.g: \(...\)...\1 ), which are not supported by mathematical regular
expressions. (POSIX leaves back references undefined for the extended syntax
of “grep -E”, although common implementations such as GNU grep accept them
there too.)
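
To illustrate the point about back references, here is a minimal sketch in
Python (chosen only for illustration; the pattern syntax is the same in Perl 5).
The function name is_doubled is made up for this example. It matches strings
made of some non-empty run of "a"/"b" characters written twice in a row, which
is a language that no finite automaton can recognise:

```python
import re

# \1 must repeat exactly the text captured by group 1, so this pattern
# accepts { w w : w is a non-empty string over "a" and "b" }.
# That language is not regular in the mathematical sense.
DOUBLED = re.compile(r"([ab]+)\1")


def is_doubled(text: str) -> bool:
    """Return True if text is a non-empty a/b string repeated twice."""
    return DOUBLED.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["abab", "aa", "abba", "aba"]:
        print(sample, is_doubled(sample))
```

Here "abab" and "aa" are accepted, while "abba" and "aba" are rejected, because
the second half has to be an exact copy of the first.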

Perl 1→2→3→4→5→→→5.18.0 extended the syntax for regular expressions considerably
with its many extensions, and other programming languages created similar
grammars for regular expressions with their own extensions, missing features,
etc. On the Freenode IRC (= online chat) network, people often join
#perl to ask us questions about these “Like-Perl-But-Different” regexes (LPBD -
an unflattering play on the name of the
https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions library ) and
it annoys some regulars, though we sometimes help them with their issues.

After the fact, many languages copied the Perl 5 regular expressions, as the
following quote notes:

«
Now, the thing I really wanted to get to, which we think will influence the
world outside of Perl is we saw how everyone borrowed Perl 5 compatible regular
expressions, and we figured - well, you know, they're a really big mess, and
we're sorry, but we're changing them now, now that you've just borrowed them.

So everyone is gonna go from PCRE to P6RE or something like that.
»

That was the intention back then. The project to create Perl 6 has made
some releases, but they are still not fast and bug-free enough for production
use. I know that Perl 6 patterns include the entire feature set of
https://en.wikipedia.org/wiki/Formal_language -s, which is a
superset of the languages supported by mathematical regexes/state machines.

So the times are indeed "interesting" (Whether they are good is a different
issue).

To sum up, Perl 5 does not have a sole claim on what "regular expression" means
and implies and it is not the origin of the term or the programming
paradigm, though naturally it has influenced its evolution and mutation a lot.

Regarding Perl 5, I still happen to use it a lot, in part because it's my
comfort zone and I'm used to it. I've taken a look at Python, Ruby and other
languages and have also written some code in them, but I didn't find them
earth-shattering - I found them better in some respects than Perl 5, but also
worse in many others, and they weren't as enlightening as learning Perl 5 and
UNIX after knowing DOS, Windows 3.x, BASIC, QBasic, C, and C/C++ around 1996 (I
actually felt it was an "epiphany"). I agree Perl 5 has many years (possibly
decades) of legacy and I have written a page with many items about what you can
do wrong with it (which the interpreter and compiler often won't complain
about) and what should be done instead -
http://perl-begin.org/tutorials/bad-elements/ (although a lot of it is not
specific to Perl).

(I should note I’m not entirely blasé because I had a real joy discovering such
useful tools as https://en.wikipedia.org/wiki/VLC_media_player and
http://emscripten.org/ ).

On https://en.wikiquote.org/wiki/Alan_Kay there’s a quote that says:
         
<<<
    Perl is another example of filling a tiny, short-term need, and then being
a real problem in the longer term. Basically, a lot of the problems that
computing has had in the last 25 years comes from systems where the designers
were trying to fix some short-term thing and didn’t think about whether the
idea would scale if it were adopted. There should be a half-life on software so
old software just melts away over 10 or 15 years. 
>>>

On the other hand “Aristotle” gives a somewhat opposing but insightful view
here:

http://blogs.perl.org/users/aristotle/2013/06/decade-scale.html

He compares backward compatibility to gravity, saying that without it we'd
just float around in space, and that the evolution of carbon-based life on
Earth still required nature to have gravity.

I could go on, but I want to say that there is a lot of cruft and legacy in
Perl, and it [expletive deleted] quite a bit, but I guess it's still usable with some
discipline. It's become an Internet sport to hate and criticise Perl, but: 1.
It's possible to write bad code in any language. 2. Many other popular
languages have other bad aspects, and you can never please everyone. This talk,
which I was referred to, shows some design aspects of Scala that the lecturer
(and I) found wrong - http://www.youtube.com/watch?v=TS1lpKBMkgg (apologies if
you are hearing and/or sight impaired; there may be a transcript or subtitles
somewhere), and Scala is a young language which could have been
designed properly.

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
The Human Hacking Field Guide - http://shlom.in/hhfg

Major Kira: Well, duh! The prophets spoke of several of The Invisible’s
children and their whereabouts.
Jadzia: Major, you and I will need to talk someday about that “duh”.
    — http://www.shlomifish.org/humour/Star-Trek/We-the-Living-Dead/

Please reply to list if it's a mailing list post - http://shlom.in/reply .

