Subject: Re: bad code Re: Subject: ChatGPT results are "subject to review"
From: "Piez, Wendell A. (Fed) wendell.piez@xxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 7 Jul 2023 16:13:08 -0000
Gentle readers (which includes you, John Lumley :-),

The XSL-List archives are undoubtedly already in the training set for the big
LLMs. All our old code there is what Dorothy is seeing, regurgitated. (From
where else could they have got it?) They should be paying us royalties.

Indeed how well one of these does on an XML or XSLT task is a direct
reflection of how that task is covered in the archives as well as Reddit,
StackOverflow and the open forums altogether. This is easy enough to see if
you switch topics to something even more obscure.

Or if you ask it to go 'meta' and, say, tell you things such as who contributes
to the open forums and lists, and what they say - something it will presumably
fabulate as cheerfully as it does about anything, until it's told it
shouldn't. Think about this for a second. This is about the erosion of trust
that Mike K (was it?) noted. Dorothy, you may consider yourself a mid-level
programmer but tell ChatGPT that you are the best, and it will not disagree.

I am not sure this will mean that we can't trust 'facts' any more. But we will
have to be much more intentional about what sources we rely on and how the
integrity of those sources can be guarded. "Fake people" and fake information
about real people are indeed actual, real risks, much more than bad XSLT that
won't actually be deployed, much.

Bringing it back on topic: doesn't the existence of 'confected code', like
applications based on generated code (hat tip to Roger in other thread), more
or less mean we have to come back to unit tests, in order to demonstrate, not
merely claim, the correctness and viability of processes?

And isn't 'confected code' already a problem, even if LLM code-assistance
makes it worse (or better)?
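
To make "demonstrate, not merely claim" concrete, here is a minimal sketch, in Python rather than XSLT only so it runs anywhere. The function titles_sorted is made up for illustration and stands in for whatever code was confected, by a person or by an LLM; the sample document and the test names are likewise assumptions, not anything from this thread.

```python
import unittest
import xml.etree.ElementTree as ET


def titles_sorted(xml_text):
    """Hypothetical 'confected' function: return all <title> texts, sorted."""
    root = ET.fromstring(xml_text)
    return sorted(t.text for t in root.iter("title"))


class TitlesSortedTest(unittest.TestCase):
    # Small, hand-checked input: the expected result is obvious by inspection.
    SAMPLE = (
        "<books>"
        "<book><title>XSLT</title></book>"
        "<book><title>DocBook</title></book>"
        "</books>"
    )

    def test_titles_come_back_sorted(self):
        self.assertEqual(titles_sorted(self.SAMPLE), ["DocBook", "XSLT"])

    def test_no_titles_gives_empty_list(self):
        self.assertEqual(titles_sorted("<books/>"), [])
```

Saved to a file, this should run with python -m unittest. The point is only that the expected values are written down by someone who checked them, independently of where the code under test came from.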

Cheers, Wendell

From: John Lumley john.lumley@xxxxxxxxxxxxxx
Sent: Friday, July 7, 2023 9:43 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: bad code Re: Subject: ChatGPT results are "subject to review"

Perhaps more importantly, I assume there is no way we can prevent
aforementioned hazard from using the XSLT-list as training data? Having made
some contributions I in no way wish those to be used/mangled by a glorified
deep pattern-matcher. Such a pity that knowledge-based programming didn't get
really pushed much further in the early 90s...
John Lumley

On 7 Jul 2023, at 14:35, Dave Pawson wrote:
Which begs the question, how might the xsl-list archives be ...
declared / converted / made available (whatever) as training data?
 And for this set (minor drawback), how to extract the 'eventual'
solution from others proffered in error?