XMLToCursor Parse Error

(OP)
My application interrogates a WEB Based Database and returns e.g. <fname> David </fname>

The following code had been working OK until recently, when I came across a <fname> . </fname>.

CODE -->

XMLToCursor(QRZ_Lookup,"cur_QRZ_Lookup")
Contacts_name = StrExtract(qrz_lookup,"<fname>","</fname>") + " " + StrExtract(qrz_lookup,"<name>","</name>") 

The "period" in the First Name produced the following error.



What would I need to do to my code to prevent this error occurring for any invalid values?

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

David,

Why are you doing both XMLTOCURSOR() and parsing the XML with STREXTRACT()? If your only aim is to extract the first and last name, then STREXTRACT() would do it by itself - and would not produce that error.

Of course that doesn't apply if you need the cursor for other purposes. In that case, could you STREXTRACT() the name first, then programmatically delete it from the XML (using STRTRAN() perhaps), then convert the remaining XML to a cursor?
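
A minimal sketch of that second approach, assuming QRZ_Lookup already holds the XML string and contains exactly one <fname> node. The variable names lcFname, lcNode and lcCleanXML are made up for illustration, and the nFlag value 4 of STREXTRACT() (return the delimiters as well) needs VFP 9:

```foxpro
* Sketch only: QRZ_Lookup is assumed to hold the XML string from the web service.
LOCAL lcFname, lcNode, lcCleanXML

* The value between the tags, without the tags themselves.
lcFname = ALLTRIM(STREXTRACT(QRZ_Lookup, "<fname>", "</fname>"))

* With nFlag 4 STREXTRACT() returns the delimiters too, i.e. the whole node.
lcNode = STREXTRACT(QRZ_Lookup, "<fname>", "</fname>", 1, 4)

* Remove the node from the XML before converting the rest to a cursor.
lcCleanXML = QRZ_Lookup
IF NOT EMPTY(lcNode)
   lcCleanXML = STRTRAN(QRZ_Lookup, lcNode, "")
ENDIF

XMLTOCURSOR(lcCleanXML, "cur_QRZ_Lookup")
```

The resulting cursor then has no fname field, so the name has to be taken from lcFname.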

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: XMLToCursor Parse Error

That can't be the XML you gave there; that's perhaps what you got in the cursor after the XMLToCursor, and it would already show that XMLToCursor can't parse it fully into a simple cursor structure, because no XML should appear in a field if the XML is a table structure.

It also isn't always a table structure. XML, in general, is a nested hierarchy of freely choosable tags (eXtensible markup: not just a defined set of tags, any tags), and only some XML has the schema of a table: field tags nested in row tags nested in a table (or cursor) tag, which has to be the tag embedded in the outermost XML tag enclosing it all. If all these nestings are hidden inside further levels you might get multiple cursors from it, but not with XMLToCursor; that requires other handling.

Extremely simplified, the guaranteed subset of XML that works, and what XMLToCursor is meant to be able to parse, is what you generate with CursorToXML.

The more complex way to consume XML Web services goes through further classes and a WSDL service definition, so you're just lucky when a simple XMLToCursor works and gives you something you can then finally extract with STREXTRACT. But - as said - that's already a sign the XML wasn't meant to be parsed into a cursor format. The XML you got in one field is rather an object having a property with multiple child properties, a 1:nm hierarchy.

So you have to understand that XML is much more arbitrary and can have any kind of simple or complex nested nodes in it; it's not just another table format.

The most general thing you can do is put XMLToCursor into TRY..CATCH to make use of the intended interface of this function: it either works and gives you a cursor, or it raises an error when the parsing finds a violation of the DTD or schema. Neither of those may exist in the XML or be referred to, in which case more general rules and an inferred (guessed) schema play a role.

But there's no option like ignoring and skipping unallowed characters. So in the end all you can do in such cases is adapt the code to the situation and, in defense against periods in the XML, first read it whole via FILETOSTR(), remove the periods, and save it back for parsing (or parse from the memory variable).

Since you can't foresee what parsing problems you get there is no general way of handling it. In the simplest case just fail to create a cursor and retrieve the data. Log the case for later analysis, see what's not working, and extend your code.
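
A minimal sketch of that pattern, assuming QRZ_Lookup holds the XML string. The file names qrz_errors.log and qrz_failed.xml are hypothetical, chosen only for this example:

```foxpro
* Sketch only: the log and dump file names are made up for illustration.
LOCAL loErr, llParsed
llParsed = .F.

TRY
   XMLTOCURSOR(QRZ_Lookup, "cur_QRZ_Lookup")
   llParsed = .T.
CATCH TO loErr
   * Append the error to a log file and keep the offending XML for later analysis.
   STRTOFILE(TTOC(DATETIME()) + " " + TRANSFORM(loErr.ErrorNo) + " " + ;
      loErr.Message + CHR(13) + CHR(10), "qrz_errors.log", .T.)
   STRTOFILE(QRZ_Lookup, "qrz_failed.xml")
ENDTRY

IF NOT llParsed
   * No cursor this time: skip the lookup or fall back to STREXTRACT().
ENDIF
```

The third parameter .T. of STRTOFILE() appends to the log instead of overwriting it, while the dump file only ever keeps the latest failing XML.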

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

(OP)
Hello Mike,

Thank you for your reply, much appreciated.

Quote (Mike Lewis)

Why are you doing both XMLTOCURSOR() and parsing the XML with STREXTRACT()?

It's been a few years since I wrote this particular code; the only answer I can give is that maybe my thoughts were that I should download the Data in XML and then store it in a Cursor to enable the information to be accessed from various modules within my application.

Quote:

If your only aim is to extract the first and last name, then STREXTRACT() would do it by itself - and would not produce that error.

I wrote a simple program to get the XML Data and just used STREXTRACT(), and this worked OK as you suggested.

How would I make the XML Data available to other modules without using a Cursor?

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

By the way, there are other parsers. They will also not convert any XML to cursors, but that's just the nature of XML. As said, it's not just a table format.

So the simplest way to make XML available to any module is as the XML string it is. Of course, you can never get closer to what the XML is than the XML itself. Then let the individual modules pick out the portion they need, for example (oversimplified XML):

CODE --> XML

<XML>
<table1>
...table 1 rows...
</table1>
<table2>
...table 2 rows...
</table2>
</XML>

Obviously STREXTRACT is ideal for cutting out the section of table 1 with STREXTRACT(cXML,'<table1>','</table1>') to get just the portion

CODE --> XML

<table1>
...table 1 rows...
</table1> 
Depending on parameterization with or without start and end tag.

If you can manage to get such XML portions that actually are a table structure, you can also wrap them in XML again and let XMLToCursor make a cursor from that.
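
A rough sketch of that, assuming lcXML holds the whole response and the table1 section contains repeating row nodes that in turn contain one node per field. The wrapper tag VFPData and the cursor name curTable1 are choices made for this example only:

```foxpro
* Sketch only: lcXML is assumed to hold the complete XML response.
LOCAL lcSection, lcWrapped

* Without further parameters STREXTRACT() returns the rows without the table1 tags.
lcSection = STREXTRACT(lcXML, "<table1>", "</table1>")

IF NOT EMPTY(lcSection)
   * Put an outer node around the rows again so the nesting is
   * outer node > row nodes > field nodes, as XMLToCursor expects.
   lcWrapped = '<?xml version="1.0"?><VFPData>' + lcSection + '</VFPData>'
   XMLTOCURSOR(lcWrapped, "curTable1")
ENDIF
```

The XML declaration here names no encoding, so the parser assumes UTF-8; if the extracted text is in another encoding, the declaration has to say so.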

Maybe look into turning XML to an object instead, or to other alternatives. wwXML, for example, was written even before VFP7 got minimal XML support and already does more than VFP now. Other VFP XML implementations exist, there are a few on GitHub.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

(OP)

Quote (Olaf)

By the way, there are other parsers. That will also not convert any XML to cursors, but that's just the nature of XML.

Hello Olaf,

Thank you for your replies, much appreciated.

I will take on board your comments and will carry out a few more tests to see which option will best suit my needs.

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

Just by the way, as the most general view of XML is a tree of nodes named by the tags, there also is a more general way to turn the XML text into an object, and that's why I recommended that. But it also has its limits.

Even a simple example can show you the limits:

CODE

<XML>
<family>
<father>Dad</father>
<mother>Mom</mother>
<daughter attribute="first born">Alice</daughter>
<daughter>Heidi</daughter>
</family>
</XML>

What you could have in VFP is oFamily with properties father, mother, and daughter. Then you not only fail on the second daughter, which needs either another property of the same name or turning a simple property into an array; you also have no way to store more than a value in a property: the daughter property can't have another property called attribute where "first born" is stored.
Obviously one way to cope with that is to turn any name into an object in the first place, which can be extended with subobjects.

But then the problem becomes how do you finally end this in simple properties with a value? Let all objects have a value property like controls have? What if value is the name of an XML node?

So you see, in the end you need to know in advance what structure you expect. And you do: you expect one node named fname. Just like you design a person table storing a name in fname and lname, have these fields, and then rely on them existing.

The other thing to see from this is that you could even expect several ways to turn this into a table.
A table called family with fields father, mother, daughter1 and daughter2, disregarding the attribute. Or a table family with person records and a persontype that is father and mother once each and daughter twice, with a name column storing the name and an attribute column storing the extra information from there.

There is no standard way to turn XML into anything but, strictly speaking, the XML it already is. You have to know the structure of the XML for a certain method called from the Webservice and parse it accordingly. Just like you know in advance what fields a SQL result has.
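
To make the two readings concrete, here is a small sketch of both layouts filled with the data from the family example. All cursor and field names (curFamily, curPerson, pname, attrib) are made up for illustration, and the second layout is only one possible interpretation:

```foxpro
* Sketch only: cursor and field names are invented for this example.

* Variant 1: one wide record, the "first born" attribute is lost.
CREATE CURSOR curFamily (father C(20), mother C(20), daughter1 C(20), daughter2 C(20))
INSERT INTO curFamily VALUES ("Dad", "Mom", "Alice", "Heidi")

* Variant 2: one record per person, the attribute survives in its own column.
CREATE CURSOR curPerson (persontype C(10), pname C(20), attrib C(20))
INSERT INTO curPerson VALUES ("father",   "Dad",   "")
INSERT INTO curPerson VALUES ("mother",   "Mom",   "")
INSERT INTO curPerson VALUES ("daughter", "Alice", "first born")
INSERT INTO curPerson VALUES ("daughter", "Heidi", "")
```

Both cursors are legitimate results for the same XML, which is why no generic converter can pick one for you.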

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

(OP)
Having checked through my VFP/XML code it would appear that I had based the code on the contents of this thread: - https://www.tek-tips.com/viewthread.cfm?qid=1695937

My application has a "Data Entry Form" which is used to record "Contacts" in a MySQL Database. When the Form is run, it checks this database to see if the contact already exists, if it does, the Forms Fields are populated with the relevant data. Any Blank Fields or if the contacts details don't exist they are populated by the WEB Based Database Details if they exist.

Data entered on the "Data Entry Form" is then checked for any obvious errors before saving to the Database.

My code appears to use a mixture of "cur_QRZ_Lookup" and "StrExtract(qrz_lookup,"<name>","</name>")". I think my best option would be to make use of StrExtract instead of using cur_QRZ_Lookup.

Thank you both for your advice much appreciated.

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

Quote:

I think my best option would be to make use of StrExtract instead of using cur_QRZ_Lookup.

That looks like the best solution (for the reasons I explained in my first post in this thread) - and also the simplest.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: XMLToCursor Parse Error

Because I said about the XML coming from "http://www.treasury.gov/ofac/downloads/sdn.xml"

Quote (myself)

I can convert it to one cursor with XMLToCursor

The method to load XML into an XMLAdapter class is the more general way to even be able to detect multiple tables in the XML. But then the contradiction of XMLToCursor pulling in all data, not just the first 10% of the XML and the iteration on XMLAdapters.Tables collection.

Again, XML is not just a tool to store table data. It's far more generally hierarchical data, and it even tends to nest details within the parent nodes instead of separating them into parent/child tables with a relation. The relation is expressed by nesting into a parent node; it's like subobjects of a parent object pointing to it simply by a reference pointer, not by a foreign key/parent key ID pair. So the concepts of XML and databases are not really a match.

In that case, XMLAdapter fails to see many more tables by misjudging the structure. But the better analysis of the problem is in what I said later:

Quote (myself)

The nature of the XML is to nest joined data into the records XML elements of the main table sndList, this why you can't parse out or convert the single tables, you have to post process the one cursor you get to extract the records in a "normalized" way. It is your start point for your postprocessing unless you find empty list fields, where there should be detail-data. But I can't judge that for you.

I can't judge, so how should code be able to judge? This requires internal knowledge about the meaning. You have to know what the XML means, how it was constructed and what relations the nodes have; that's not self-descriptive. What's technically clear about this sdn.xml is that it contains one sndList, that's the start and end tag. From that it's already wrong to see 10 tables, but then it is actually so that this one list contains many sections, which XMLToCursor interprets as fields, and loads these XML sections.

Both tools XMLAdapter and XMLToCursor are not AI, very mechanical and so not perfect. StrExtract also isn't, but for StrExtract any surrounding context doesn't matter. As long as you're sure the XML you receive is about one contact and the first name is in some <fname> tag, you can use StrExtract(), though. That makes it more stable for this task. Once this becomes XML about multiple persons you'll need to first see how to separate them and then find their own <fname> tag. So once you know some XML is about one record and you seek out one field of it, it's very natural to use StrExtract and not care about the XML more than perhaps seeing what kind of codepage conversion you need to do about the extracted text. But that's also no final truth.
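
As a small illustration of the codepage remark, a sketch that assumes the response is UTF-8 encoded (as its XML declaration would state) and that the application works in an ANSI codepage. The variable names are made up for the example:

```foxpro
* Sketch only: assumes QRZ_Lookup is UTF-8 encoded text.
LOCAL lcRaw, lcFname

lcRaw = STREXTRACT(QRZ_Lookup, "<fname>", "</fname>")

* STRCONV() with conversion setting 11 converts UTF-8 to double-byte/ANSI characters.
lcFname = ALLTRIM(STRCONV(lcRaw, 11))
```

STREXTRACT() returns the raw text between the tags, so XML entities such as &amp; also stay encoded and would need their own replacement step.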

It's just like a query to a database, you can only make sense of a database or a subset taken from it by knowing what relations exist, what tables and fields mean, the semantic isn't in the structure alone. That's what I continually say.

So to write correct code about some XML Web Service you have to know the method and its result XML, on top of the SOAP protocol part that's automatically handled for you when using WSDL. In this case that's not the case, though; it's simply a huge XML file from the start. And, for example, Firefox tells "No stylesheet information are linked to this XML" (so no XMLNS, no XML schema, no DTD), which would at least be technically readable information about the structure, still not the semantics. And Firefox then says "The following shows the tree-view of the document", and that tells what's the only general truth about XML: that it is a hierarchical tree of nodes, always.

It boils down to needing to initially know how this tree is structured on an abstract level, what you have to expect in nodes, what repeats how, what is nested into which other node, just like you need to know table and field names and relations of tables to query them with sensible queries. So XML always has to come with how to interpret it, or you can only fall back to its hierarchical structure: nodes are named brackets containing (nesting) something.

You find better possibilities to deal with XML in other languages. I mean, you can blame this on the parsers in MSXML3 and 4, but that's also what VFP XML functions are always stuck with. The Webbrowser control in contrast, for example, always uses the browser canvas of the most current IE version installed. The VFP team decided to stick with specific versions to be able to ensure the usage works forever. And it's also somewhat okay, because you don't gain better XML interpretations from later MSXML versions; the problem of XML not transporting its own meaning is inherent. SQL database tables also don't tell how they are meant, there's just more meta info about relationships, for example, when the database designer puts it in, at least.

The nesting of XML inside its nodes is only a sign of a general relationship of a certain XML node structure to a parent node when it repeats the same way, so each single nesting doesn't establish a general relationship, just the relationship of the child node XML to this exact one parent node. In the database sense that means generally unstructured data; JSON is the same, just less complex than XML. There only is an overlap of these worlds as XML is indeed also used in a more restricted way. And ever so often people chunking out some XML do nobody a favor by not documenting what it is, neither with a technical schema and even less so with human-readable, natural text descriptions of what this is about. Just like a class library needs documentation of which methods to use for what to make good use of it, and the only chance to dive into using a library is that you already know about its topic and have the right expectations about how it works.

So just don't look for the simple recipe you can always use. This is always individually different.

You can find XML structures that programmers use more often than would be the case at random, because you want to use XML to structure data, obviously. You can easily categorize XML in one further simple way: with or without repeating inner structures. Those tell you whether this is about a record or a head structure. Configuration XML consisting of options and values will usually have no repeating nodes; it may be as simple as a single record of a number of fields named by the option and containing the setting value. It may also be seen as n records of key name and value and interpreted as a key/value store. And it may be that one of the options has a nested substructure and so it fits neither key/value nor a record.

XML in the end is more like a complex object with subobjects, with the catch-22 I mentioned about even that not working for repeated nodes within the same parent node. Because a hierarchy can do things tables can't. Unlike rows of a table, child nodes don't need to care about being the same composition of nodes, even when they are nested within the same type of parent node as another group of child nodes. If you convert a table into XML that is what you get, but you don't always have that and can't always convert XML into a row structure.

And then, last but not least, the way we database people see a tree, usually with a self-referencing table or perhaps a hierarchy of tables, is also not something that needs to fit any XML, so there is no general 1:1 relationship between XML and a table structure you could extract from any XML.

All this will fall into place once you really just dig into what XML is. Maybe start much smaller and try a bit of HTML, XML with a fixed set of nodes that also have a fixed meaning. It's all about why this all is nested and how easily you can already have nesting errors by start tags appearing in a parent tag pair while the end tag is outside of it, just like [ { ] } wouldn't be a valid bracketing. XML is very fragile, actually because it's open to anything.

To get more concrete again, for example Twitter has a very good documentation. It once also was about the optional XML results you could get, today it all seems JSON, but a good portion of JSON nesting nodes still is the same as XML. And here's a very good example of response fields: https://developer.twitter.com/en/docs/twitter-api/...

It's still important to understand that this notion is more about what object properties are, not table fields. And the field types also being arrays tell you about the properties that are actually a list of objects themselves: again, the nesting.

And the final truth about this is you can never expect any parsing extraction, complex or simple, to always work and extract all information from the XML into an object or a table or a few tables and objects. But you can always start at the root XML node and traverse the tree "vertically" or "horizontally" first (drilling down into children or visiting siblings). And in XML that should have tables in it, neither of the two is really the best way forward; you need to already know which nodes, being sections of repeated nodes, you will find inside. Just like you know the very low-level detail of finding an fname node. Just beware of the future case where there are two or three and you only extract one.
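
For the generic traversal, a minimal depth-first sketch using the MSXML3 DOM, assuming QRZ_Lookup holds the XML string. The procedure name WalkNode is made up for this example; it only prints the element names with indentation and does not interpret anything:

```foxpro
* Sketch only: prints the element tree of the XML held in QRZ_Lookup.
LOCAL loDoc
loDoc = CREATEOBJECT("MSXML2.DOMDocument.3.0")
loDoc.async = .F.

IF loDoc.loadXML(QRZ_Lookup)
   WalkNode(loDoc.documentElement, 0)
ELSE
   ? "Parse error:", loDoc.parseError.reason
ENDIF

PROCEDURE WalkNode(toNode, tnLevel)
   LOCAL lnI, loChild
   ? SPACE(2 * tnLevel) + toNode.nodeName
   * Drill down into the children first ("vertically").
   FOR lnI = 0 TO toNode.childNodes.length - 1
      loChild = toNode.childNodes.item(lnI)
      IF loChild.nodeType = 1   && 1 = element node, skips text nodes
         WalkNode(loChild, tnLevel + 1)
      ENDIF
   ENDFOR
ENDPROC
```

Such a walk shows the structure, but deciding which of the printed nodes are records and which are fields still needs knowledge about the service.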

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

I can understand if you don't have the time to plow through this, even if you like me writing verbosely about some topics.

The main thought can be condensed like this: XML is like a hierarchical database, it's still a database, but

1. You can't expect a general conversion routine able to put the hierarchical data into a relational model you're used to.
2. Even taking the schematic differences aside, you can't expect to be able to use a random database without knowing more details about it, what you find where.

Working with some Webservice you can rely on the XML always having the same nature; you always get the same data structure, everything else makes no sense. But you always have to dig into this to know what you deal with and how you deal with it. Don't look for general recipes and don't ever expect anything to stay as is forever.

In your case the major problem was that you already had a working extraction that stopped working, right? So something about the XML must have changed at a point. There is no defense against that other than putting things into error handling and suppressing messages from users; instead log them and then look into them yourself and adapt. There is no auto adaption. In your case, you're nearer to it than you could normally get, but a decision to rename the nodes to firstname instead of fname can also still hit you, just like the XML having multiple fname nodes, even though you're sure you only request XML about one contact. Some day it might contain child nodes about the contacts of the contact or have multiple fname nodes because the key you use to query is not unique. And then this even gives no error, you just disregard any secondary name.
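
A small sketch of such a defensive check before trusting STREXTRACT(), assuming QRZ_Lookup holds the XML string. The log file name qrz_warnings.log is hypothetical:

```foxpro
* Sketch only: counts the fname start tags before extracting the value.
LOCAL lnCount, lcFname
lcFname = ""
lnCount = OCCURS("<fname>", QRZ_Lookup)

DO CASE
CASE lnCount = 1
   lcFname = ALLTRIM(STREXTRACT(QRZ_Lookup, "<fname>", "</fname>"))
CASE lnCount = 0
   * Node missing or renamed: nothing to extract, but worth a log entry.
   STRTOFILE(TTOC(DATETIME()) + " no fname node found" + CHR(13) + CHR(10), ;
      "qrz_warnings.log", .T.)
OTHERWISE
   * Several names: take the first one, but record that others were ignored.
   lcFname = ALLTRIM(STREXTRACT(QRZ_Lookup, "<fname>", "</fname>"))
   STRTOFILE(TTOC(DATETIME()) + " " + TRANSFORM(lnCount) + " fname nodes found" + ;
      CHR(13) + CHR(10), "qrz_warnings.log", .T.)
ENDCASE
```

This does not prevent the service from changing, it only turns a silent misreading into something you find in a log.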

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

Last but not least, I can reproduce the error and think you actually found a VFP bug in the MSXML usage.

If I just make assumptions about the minimal XML that must surround the fname node so that XMLToCursor starts parsing it, XMLToCursor errors the way you posted. At minimum two surrounding node levels are necessary: when CursorToXML generates XML you get an outer node "VFPData", inner nodes with the alias/tablename that are records, and then the innermost level nodes for each field. And that's also how XMLToCursor interprets any XML; if that's not what the nodes are, you can get strange results.

CODE

QRZ_Lookup = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><xmldata><contact><mytag>.</mytag></contact></xmldata>'
? XMLToCursor(QRZ_Lookup,"cur_QRZ_Lookup") 

The xmlns doesn't matter at all, by the way.

And where does this come from? As I know XMLToCursor is based on MSXML3, I can look into what msxml3.dll offers me via VFP's object browser.
There are MSXML2.DOMDocument.3.0 and also a SAXXMLReader and an MXXMLWriter, and these are playing a role behind the scenes of XMLToCursor. We can use them more directly:

CODE

QRZ_Lookup = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><cursor><contact><fname>.</fname></contact></cursor>'
oDoc    = CreateObject("msxml2.DomDocument.3.0")
oReader = CreateObject("msxml2.SAXXMLReader.3.0")
oWriter = CreateObject("msxml2.MXXMLWriter.3.0")
oDoc.LoadXML(QRZ_Lookup)
oWriter.byteOrderMark = .T.
oWriter.omitXMLDeclaration = .T.
oWriter.indent = .T.
oReader.contentHandler = oWriter
oReader.dtdHandler = oWriter
oReader.errorHandler = oWriter
oReader.putProperty("http://xml.org/sax/properties/lexical-handler", oWriter)
oReader.putProperty("http://xml.org/sax/properties/declaration-handler", oWriter)
oReader.parse(oDoc)
? oWriter.output 
This is just porting a simple SAX example from https://docs.microsoft.com/en-us/previous-versions...

And while it's counterintuitive that this is just writing out XML in an XMLWriter which the XMLReader just read from a DOMDocument, which also in itself loaded the XML inside it and parsed it to an object model, this all shows, that the MSXML3 parser has no problem with a period as a node value. The output isn't simply a copy of the input, which you can see by its indentation. Change oWriter.indent = .F., and the output XML has no indentation, so the XML went through parsing, even twice from DOMDocument and the XMLReader.

I don't know where this comes from, but in the end I think it really is a VFP bug, not an MSXML3 bug.

The C++ code for the XMLToCursor function does use MSXML3, of that I'm sure. But obviously it has a very VFP-specific section, as none of the MSXML3 classes will create a VFP cursor. That part of XMLToCursor must take the parser output and create a cursor with a certain structure populated with the values. And the bug is likely in that section. The only thing speaking against it is that the error is clearly reported as a parse error, also in the way an XMLReader would report it. So likely the way I used the MSXML3 classes is not the way XMLToCursor uses them.

And then I am guided to https://www.qrz.com/page/current_spec.html about the XML Service QRZ offers and they describe their XML there. If I just search for fname there I think you do a callsign lookup.

Taking the XML example XMLToCursor just creates a cursor from the callsign portion, it's easy to extract both that and the Session section, too with STREXTRACT, but that just as a side note.
You're just a little lucky QRZ has this actual record data at this level, just right for XMLToCursor to create a single record cursor.

Anyway, there's nothing special in this XML like an inline schema or DTD definition that would rule out a period in a node. You can take any node and replace its value with a period only and get this parsing problem. You have hit a very special case of MSXML3 and/or VFPs code about XMLToCursor failing in that case.

But aside of that, it's still the normal "interface" of XMLToCursor to error in case of any parsing problem. For example add a space inside a node name, ie turn <fname> to <f name> to cause a real XML error and you'll see you even get an exact position about the parse error. If you then use the MSXML3 classes instead of XMLToCursor you actually only get a general parse error with no position information, just that some name contains an invalid character.

What this mainly shows is, that you actually need to program for errors to happen, as I already said early on this is the way XMLToCursor or parsers work, they throw an error in case of a parse problem. That's their "user interface", or in that case "programmer interface". And that can also happen just because the XML text might have a transfer error. I don't think the period comes from a transfer error, someone actually entered a period as his/her name, but anything can happen and who knows which other specific node value causes parsing to fail, too. You never know.

I am not really surprised QRZs web service description can't compare to Twitter, for example, but it's simple enough, it's good enough. And since it's not any and all XML parsers erroring, you can't blame them. You were once lucky XMLToCursor works here and just disregards the outer nodes, and that you didn't need the data from the <Session></Session> section. If you look in detail you'll see the cursor you create does not contain the Key, Count etc. data.

Obviously there won't be a fix for this, as the problem isn't clearly located in MSXML3. To involve Microsoft in a fix someone would need to find out which MSXML3 component would fail on such a node. It's still a possibility, and while MSXML3 is an old version, the MS page https://support.microsoft.com/en-us/help/269238/li... says about the support lifecycle policy:

Quote (Microsoft)

MSXML 3.0 support follows the support policy of the OS into which it is built.

No idea what that means exactly, but the XML libraries in all their versions are quite central to Microsoft products.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

And I also tried the XML 4.0 variants of the OLE classes and let msxml2.SAXXMLReader.4.0 parse the XML with a period value node, just to be sure. MSXML3 and MSXML4 have no problem with such a node. But in case of introducing another XML error I also get only the XML node output, so the default error handling (set by oReader.errorHandler = oWriter) is silent but then parsing doesn't write XMLnodes and so you would detect a problem, if there was one.

To get to the bottom of this one would need to analyze how exactly VFP uses MSXML. The Sysinternals Process Monitor tool would perhaps show this, but that surely is beyond your interest.

Just looking at the XML example the structure is simple enough to work with StrExtract, you actually only have two records of data in the XML and when you're only interested in the fname, there always is only one in the overall XML as long as the interface is as it is.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error


You can use nfXml ( https://github.com/VFPX/nfXML ),
which parses XML into a VFP object using msXml.domDocument:

just do:

CODE

myObject = nfXmlRead( myXmlString )

? myObject.property  && replace "property" with a node name from your XML

* if you need to save to a cursor,
* follow the common procedure for a list of objects:

create cursor mycursor ( fld1 c(10), fld2 n(10,2) )  && list your own fields here

for each rowObject in myObject.myarraynode
   insert into mycursor from name rowObject
endfor 

Check the readme and sample code!

Marco Plaza
@nfoxProject
https://www.github.com/nftools

RE: XMLToCursor Parse Error

(OP)
Hi Olaf,

Quote (Olaf)

I can understand if you don't have the time to plow through this, even if you like me writing verbose about some topics.

I do value your in-depth replies; but being an aged and recreational user of VFP it takes a while to digest and sink in. As you may gather from my previous posts, I also have difficulty in writing meaningful replies (I was the same when I was at Work, I struggled with writing reports); be rest assured, I often return to Tek-Tips to re-read my threads.

Quote:

Last not least, I can reproduce the error and think you actually found a VFP bug about the MSXML usage.

That must be a first for me!

I wouldn't have found the problem if the user had populated the XML Data with a Name instead of a period. The user should have left the Name Field blank rather than entering a period. As the User was operating on behalf of an organisation, he placed the organisation's name in the family field.

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

(OP)

Quote (Olaf)

In your case the major problem was that you already had a working extraction that stopped working, right? So something about the XML must have changed at a point.

The issue has only showed up once in over 5 years of use, so it would have been difficult (for me anyway) to foresee this happening, it was really down to user error whilst inputting data. Maybe the Website could write some code similar to VFP VALID.

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

Quote:

The user should have left the Name Field Blank rather than a Period. As the User was Operating on behalf of an organisation he placed the organisations name in the family field.

Which suggests a design weakness in the original data-entry form.

I've occasionally come across this sort of thing myself, where I am doing something on behalf of my company, such as making a purchase or subscribing to a list. You are asked for your first name and last name rather than company name, so you end up splitting the company name over two fields, which doesn't make much sense and could lead to unintended results.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: XMLToCursor Parse Error

Mike: It actually is valid for char fields or XML nodes representing char data type to only contain a period. So the problem really is within MSXML and/or VFP's XMLToCursor implementation.

Quote (David)

so it would have been difficult (for me anyway) to foresee this happening

Yes, David, this is completely understandable. And indeed it means you never can write something specific to input coming from external, no matter if XML or any other input, that never breaks. You can't foresee any case.

What you can foresee, though, is that some problem will occur. So TRY..CATCH and general error handling is good. For inputs like XML, logging them (or just the last two or three) to a file can also help with error analysis. So it's very good you could at least find out that it's a period in the XML node that was causing this.
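A minimal sketch of that logging idea, assuming the XML arrives as a string. The function name SafeXMLToCursor and the log file name xml_errors.log are made up for illustration; they are not from David's code.

```foxpro
* Hypothetical wrapper: converts XML to a cursor and, on failure,
* appends the error text and the offending XML to a log file.
FUNCTION SafeXMLToCursor(tcXML, tcCursor)
   LOCAL llOk, loErr, lcEntry
   llOk = .T.
   TRY
      XMLTOCURSOR(tcXML, tcCursor)
   CATCH TO loErr
      llOk = .F.
      * Keep the timestamp, error number, message and the XML for later analysis
      lcEntry = TTOC(DATETIME()) + " Error " + TRANSFORM(loErr.ErrorNo) + ": " + ;
                loErr.Message + CHR(13) + CHR(10) + tcXML + CHR(13) + CHR(10)
      * Third parameter 1 = append to the file (assumed log file name)
      STRTOFILE(lcEntry, "xml_errors.log", 1)
   ENDTRY
   RETURN llOk
ENDFUNC
```

It could be called as `IF NOT SafeXMLToCursor(QRZ_Lookup, "cur_QRZ_Lookup")` followed by whatever fallback suits the application. The log grows without limit in this form; trimming it to the last few entries is left out to keep the sketch short.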

I wrote a new News thread to make this public. Perhaps we'll see whether this becomes an MSXML fix. I don't think VFP will get even a hotfix; maybe the Chinese developer of VFP10 (64bit) will report that his version doesn't have that bug, or he could find it in the VFP code.

Anyway, I mostly pointed out the alternatives you have.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

To all,

I think the problem occurs in the process of VFP trying to infer the data type of fields it is reading from the XML document. If you set the schema or if you have a cursor ready to import, XMLTOCURSOR() won't raise an error.

From Olaf's demo in https://www.tek-tips.com/viewthread.cfm?qid=180588...

CODE --> VFP

Local lcXML

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
</VFPData>
EndText

CREATE CURSOR crsFromXMLfails (fieldname Varchar(200))

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLfails", 8192)
BROWSE 

RE: XMLToCursor Parse Error

(OP)

Quote (Olaf)

So TRY..CATCH and general error handling is good.

I must admit, I am very lax when it comes to error handling. Often I come across an error and then find a fix for it, which I know is wrong; prevention is better than cure. As I am the end user, I am keen to put the code to good use.

I am improving, albeit at a slow pace.

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

Nice,

so this narrows it down to the phase of schema inferring!? I'm not so sure, it's still reported as a parse error.
Clearly parsing is a necessary step in type inferring, so that is the reason for that, but parsing will be done anyway to get the values to put into cursor records.

So what's your guess, does VFP even use the XML reader for type inference?
As far as I can see, DOMDocument loadXML infers nodeTypes (element, endelement, ...), not data types, and nodeTypedValue is also always vartype char, just like the text property of a node. Does a DOM ever create object properties that are numeric or of any data type other than char? XML surely has concepts of numeric nodes with xs:float, for example, but when and where at all would I see how an XML reader infers that data type, not node type, for an XML node? Or is that the part that is VFP specific?

Bye, Olaf.




Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

I see, David, when you're the end user of your code a simple error handling like
ON ERROR SET STEP ON 
would already be a nice add-on to get into debug mode instantly when an error happens. There you'd still have access to the variable QRZ_Lookup and the XML it contains, for example, but also to all other current variables, the call stack and other things that are simply not available when you neither log nor otherwise handle errors.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

David,

May I pick up your point about being lax on error-handling.

The reason to build error-handling into your code is not to fix errors when you come across them in development. It's more to do with shielding the user from the consequences of an error. At the very least, a good error handler will notify the user of the error in a friendly way (rather than display a cryptic error message), and probably log the error and notify the developer.

In this particular case, using TRY / CATCH / ENDTRY would mean that your program can decide to ignore the problem (the dot in the name field), to report it to the user, or to log it in some way. It's all about graceful degradation. In other words, the code might not fully support the expected result (that is, the user expecting to be able to enter a dot in place of their name), but it doesn't crash the system either.
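As a rough sketch of what that could look like with David's original two lines: the variable and cursor names are his, while the WAIT WINDOW text and the added ALLTRIM() calls are my assumptions.

```foxpro
LOCAL llCursorOk, loErr
llCursorOk = .T.

TRY
   XMLTOCURSOR(QRZ_Lookup, "cur_QRZ_Lookup")
CATCH TO loErr
   llCursorOk = .F.
   * Report rather than crash; the wording is illustrative only
   WAIT WINDOW "Lookup could not be converted: " + loErr.Message NOWAIT
ENDTRY

* The name extraction does not depend on the cursor, so it still runs.
* ALLTRIM() is an addition here to drop the blanks around the values.
Contacts_name = ALLTRIM(STREXTRACT(QRZ_Lookup, "<fname>", "</fname>")) + " " + ;
                ALLTRIM(STREXTRACT(QRZ_Lookup, "<name>", "</name>"))
```

Code that needs the cursor afterwards would have to check llCursorOk first, since the cursor may be missing or incomplete after a failed conversion.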

For these reasons, it would be worth familiarising yourself with the various ways of using ON ERROR and TRY / CATCH / FINALLY (and possibly the Error method in objects, although that is something that I have never used, rightly or wrongly).

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: XMLToCursor Parse Error

Mike,

as far as I see it, user-friendly error handling can be much simplified when the developer is the user. SET STEP ON as a handler could also become an annoyance, just as ON ERROR * can leave you unaware of any errors. And indeed SET STEP ON isn't necessary, as the default system handler has the Suspend button in the error messagebox. But it's a shortcut into the debugger.

Also, often enough I just print error messages, especially when programming something nonvisual, or let them go to DEBUGOUT. Those are all single-liners that don't even need an error handler prg or a function within main.prg.

I did write an extensive error handler for end users, though. Being used to other VFP applications, they didn't actually like what would be most responsible: to always quit after any error and prevent any mischief. The worst case of a ZAP running after a SELECT failed to select the work area was no argument either, as it was too far-fetched, and too often they had the case that a minor bug could just be ignored in order to continue.

I also used RETURN TO MASTER as a compromise between exiting the application and just cancelling what's currently being done. If you have an application object with a readevents method in which you, well, put the command READ EVENTS, then RETURN TO MASTER simply cancels anything on the stack that led to the error, and you're back to waiting for the user to use a menu, a form, or whatever currently is the scope of the application. That leads to less frustration, as the startup of enterprise applications tends to be lengthier.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

Olaf,

I don't want this thread to get too side-tracked on the subject of error handling. But of course I agree that it can be a nuisance when the developer is the user. I have a fairly sophisticated error-handler that does a lot of logging and notifying, and also deals with reverting buffers, rolling back transactions, etc. But I call it like this:

CODE -->

IF <we are in the run-time environment>
  DO ERROR WITH <etc.>
ENDIF 

In the development environment, I agree that we want to suspend the program and get to either the debugger or the command window as easily as possible.
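One possible way to make that distinction, as a sketch only: it installs a global handler instead of calling it from an Error method, and HandleError and the log file name are hypothetical. VERSION(2) returns 0 for the run-time version of VFP.

```foxpro
* Install the handler only in the run-time environment
IF VERSION(2) = 0
   ON ERROR DO HandleError WITH ERROR(), MESSAGE(), PROGRAM(), LINENO()
ELSE
   ON ERROR   && default handler, which offers Suspend in development
ENDIF

* Hypothetical handler: logs the error and tells the user in plain words
PROCEDURE HandleError(tnError, tcMessage, tcProgram, tnLine)
   LOCAL lcEntry
   lcEntry = TTOC(DATETIME()) + " Error " + TRANSFORM(tnError) + " in " + ;
             tcProgram + " line " + TRANSFORM(tnLine) + ": " + tcMessage + ;
             CHR(13) + CHR(10)
   STRTOFILE(lcEntry, "app_errors.log", 1)   && assumed file name, 1 = append
   MESSAGEBOX("An unexpected problem occurred and has been logged.", 48, "Error")
ENDPROC
```

Reverting buffers, rolling back transactions and shutting down cleanly are not shown; they depend on how the application is built.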

Regarding RETURN TO MASTER, I used to take that approach in pre-Visual days. And I started by trying to do something similar in VFP. But - rightly or wrongly - I decided that, if an error occurred (one that couldn't be handled at run time), then the best thing is to quit the application as gracefully as possible. You can't know if the error caused any side effects, such as closing a table or releasing a variable, which means that the application is inherently unstable and therefore needs to be closed.

(Sorry, I said I didn't want to get side-tracked, but that's exactly what I've done.)

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: XMLToCursor Parse Error

(OP)

Quote (Mike Lewis)

I don't want this thread to get too side-tracked on the subject of error handling.

I'll start another thread with a link to this one.

Regards,

David.

Recreational user of VFP.

RE: XMLToCursor Parse Error

Quote (Olaf)

So what's your guess, does VFP even use the XML reader for type inference?

VFP uses the MSXML classes to parse XML documents and load them into cursors, but the determination of the data type of the resulting columns is VFP's complete responsibility, and it is based on the nodes' contents.

I think there are two levels of problems, here:
  1. VFP decides that the presence of a single point determines a numeric value (which is wrong, of course, since a single decimal point is not a valid number) - note that it's not this phase that raises the error, VFP will just create a column of type N(1)
  2. When reading the actual data from the XML nodes, it does not gracefully degrade to an empty value like it does in other similar circumstances (an empty numeric content, or an invalid date) - and it's at this moment that an error pops up
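The two stages can be looked at separately with a small probe, sketched here on the assumption that the cursor survives the failed import. The cursor name crsProbe is arbitrary.

```foxpro
LOCAL lcXML, loErr
LOCAL ARRAY laFields[1]

TEXT TO lcXML NOSHOW
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
</VFPData>
ENDTEXT

USE IN SELECT("crsProbe")

TRY
   XMLTOCURSOR(lcXML, "crsProbe")
CATCH TO loErr
   * Stage 2: reading the data fails
   ? "Import failed:", loErr.Message
ENDTRY

* Stage 1: the structure VFP decided on, if a cursor was left behind
IF USED("crsProbe")
   AFIELDS(laFields, "crsProbe")
   ? "Type:", laFields[1, 2], "Width:", laFields[1, 3], "Rows:", RECCOUNT("crsProbe")
ENDIF
```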

RE: XMLToCursor Parse Error

Hm, shouldn't there be a cursor then, or is it just in some proto state?

OK, let's try when preparing a cursor with a non fitting data type, just like wrong type inference would do:

CODE -->

Local lcXML

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
</VFPData>
EndText

CREATE CURSOR crsFromXMLfails (fieldname N(1))

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLfails", 8192) 

Okay, yes, VFP "blames" the parser because it can't eval("."). That means it happens after the type inference stage, simply when the value doesn't fit the cursor field, whether VFP uses EVAL, VAL or whatever else for converting the string to the data type.

That makes me wonder how much of the XML is parsed for type inference. We know from the SQL engine quirks that inferring field width fails when the first result value is short or even empty, it's one of the buggy VFP behaviors you need to know. I have seen type inferring only taking a few rows. This sample shows VFP (or MSXML) will go through more than just the first record to infer the fieldtype.

CODE --> VFP

Local lcXML

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
   <tablename>
      <fieldname></fieldname>
   </tablename>
   <tablename>
      <fieldname>+</fieldname>
   </tablename>
   <tablename>
      <fieldname>-</fieldname>
   </tablename>
   <tablename>
      <fieldname>0</fieldname>
   </tablename>
   <tablename>
      <fieldname>9</fieldname>
   </tablename>
   <tablename>
      <fieldname>A</fieldname>
   </tablename>
   <tablename>
      <fieldname>Z</fieldname>
   </tablename>
</VFPData>
EndText

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLworks") 

So it's not just ".": VFP first infers a numeric type from it and then fails to evaluate "." to a numeric value. You can see I tried to let VFP infer numeric for several rows, but it makes a full pass: I tried 2048 nodes with "." and then an "A", and it still doesn't error but infers char.

Which is good and bad news. It also means that a single value off the norm in what is really a numeric column can make VFP convert it to a char field, just because the XML once has "." or "e", perhaps, or some other unusual value. When you're used to getting a numeric field and VFP then creates a char field, the XMLToCursor might work, but your own code then fails. So it might also be a good idea to use the inference with sample XML during your initial development while it works, then store one sample result as a DBF and use that as a template for further XMLToCursor conversions. In that case you fail less often, but you can still fail when the XML changes and has more or other fields.
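The template idea could be sketched like this. The procedure names and the file name xmltemplate are assumptions for illustration; flag 8192 tells XMLTOCURSOR() to fill an existing cursor, as in the earlier sample.

```foxpro
* One-off during development: keep the inferred structure as a template DBF
PROCEDURE SaveTemplate(tcSampleXML)
   USE IN SELECT("crsSample")
   XMLTOCURSOR(tcSampleXML, "crsSample")
   SELECT crsSample
   COPY STRUCTURE TO xmltemplate   && hypothetical name, creates xmltemplate.dbf
   USE IN SELECT("crsSample")
ENDPROC

* In production: build an empty cursor from the template and import into it
PROCEDURE ImportWithTemplate(tcXML)
   USE IN SELECT("crsImport")
   SELECT * FROM xmltemplate WHERE .F. INTO CURSOR crsImport READWRITE
   XMLTOCURSOR(tcXML, "crsImport", 8192)
ENDPROC
```

Note that a free table limits field names to 10 characters, so longer XML node names would need a different way of storing the structure.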

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

Quote (Olaf)

Hm, shouldn't there be a cursor then, or is it just in some proto state?

XMLTOCURSOR() seems to follow these three steps, in case there is no schema or a target cursor in place:
  1. MSXML.Load the document
  2. Go through the nodes tree and build a mapping from XML elements to VFP columns, including the determination of the data type of the columns
  3. Go through the nodes tree again, and fetch the contents from the XML nodes into the VFP columns
If something goes wrong with step 3, VFP will create a cursor nevertheless, and will fill it with as many rows as it's able to import without error.

That is, this will create a cursor with RECCOUNT() = 0

CODE --> VFP

Local lcXML

CLEAR

USE IN SELECT("crsFromXMLfails")

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
</VFPData>
EndText

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLfails") 

ON ERROR

BROWSE 

So will this (the import finishes at the first error):

CODE --> VFP

Local lcXML

CLEAR

USE IN SELECT("crsFromXMLfails")

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
   <tablename>
      <fieldname>0</fieldname>
   </tablename>
</VFPData>
EndText

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLfails") 

ON ERROR

BROWSE 

But if the dot value comes after some other valid numeric values, the RECCOUNT() will reflect the number of rows validly imported (in this case, RECCOUNT() = 1):

CODE --> VFP

Local lcXML

CLEAR

USE IN SELECT("crsFromXMLfails")

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>0</fieldname>
   </tablename>
   <tablename>
      <fieldname>.</fieldname>
   </tablename>
</VFPData>
EndText

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLfails") 

ON ERROR

BROWSE 

On its own, XMLTOCURSOR() is a great function to quickly import data from an XML document into a VFP cursor for inspection, but, because of the decisions it takes on column mapping, I never use it in production code without a previously prepared schema or cursor.

For instance, in the following example, the imported data may not be exactly what we would (or could) expect:

CODE --> VFP

Local lcXML

CLEAR

USE IN SELECT("crsFromXMLfails")

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<VFPData>
   <tablename>
      <fieldname>0</fieldname>
      <other>2020-02-30</other>
   </tablename>
   <tablename>
      <fieldname>1</fieldname>
      <other>2020-12-00</other>
   </tablename>
</VFPData>
EndText

On Error ? Message(), "in Line", Lineno()

XMLToCursor(lcXML,"crsFromXMLfails") 

ON ERROR

BROWSE 

RE: XMLToCursor Parse Error

OK, I have to look into these code samples later. Nevertheless, XMLToCursor is not really any good unless your structure has the necessary type of nesting. If your XML only holds a few single key-value pairs, XMLToCursor won't help, neither by inferring a cursor nor by preparing one.

CODE --> VFP

Local lcXML
CLEAR

USE IN SELECT("crsFromXMLfails")

Text To lcXML NoShow
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<person>
   <firstname>Olaf</firstname>
   <lastname>Doschke</lastname>
</person>
EndText

On Error ? Message(), "in Line", Lineno()
XMLToCursor(lcXML,"crsFromXMLfails") 
ON ERROR 


So in such cases, you'd just add a layer by surrounding it with an extra node to get a record from it. But even that won't work for every XML document:

CODE --> XML

<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<person>
   <firstname>Olaf</firstname>
   <lastname>Doschke</lastname>
   <speaks>
      <naturallanguage>German</naturallanguage>
      <naturallanguage>English</naturallanguage>
      <programminglanguage>Basic</programminglanguage>
      <programminglanguage>6502 assembler</programminglanguage>
      <programminglanguage>Pascal</programminglanguage>
      <programminglanguage>68000er Assembler</programminglanguage>
      <programminglanguage>C/C++</programminglanguage>
      <programminglanguage>...</programminglanguage>
   </speaks>
</person> 
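For the flat two-field person document further up, adding the extra layer might look like the following sketch. The wrapper node name root and the cursor name crsPerson are arbitrary, and I have not verified the resulting cursor structure.

```foxpro
LOCAL lcXML, lcDeclaration, lcBody, lcWrapped, lnEnd

TEXT TO lcXML NOSHOW
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<person>
   <firstname>Olaf</firstname>
   <lastname>Doschke</lastname>
</person>
ENDTEXT

* Split off the XML declaration, then wrap the rest in an extra root node
lnEnd = AT("?>", lcXML)
lcDeclaration = LEFT(lcXML, lnEnd + 1)
lcBody = SUBSTR(lcXML, lnEnd + 2)
lcWrapped = lcDeclaration + "<root>" + lcBody + "</root>"

USE IN SELECT("crsPerson")
XMLTOCURSOR(lcWrapped, "crsPerson")
BROWSE
```

With the extra layer, person takes the role of the record node and firstname and lastname the role of the fields.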

The default for reading in any XML should be an object, shouldn't it? Especially since we have the Empty class as a basis, you can have an object that allows any name as a property (unless you leave English or use two words, but XML disallows that in node names, too) and go to a cursor from there, if you already know the structure allows it.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

OK, I see. I must have overlooked this all the time: the cursor is created even for the older examples. Bummer.

Anyway, at least the problem is settled: specifically it is the interpretation of a period, and more generally it occurs when VFP not only infers an unsuitable type but also fails to convert the XML text to that data type.

I tried your XMLSerializer class XMLtoVFP() method with my last XML example. To be fair, that's not necessarily how multiple fields with the same name would appear in XML, or is it? DOMDocument reads this in and I find all languages. I see your conversion puts collection objects in as VFP object nodes, but I only find the first naturallanguage and the first programminglanguage in the result. If it mattered for real-world cases, you'd have written that differently. I'll dig deeper into this; I may just be using this wrong or looking into the wrong nodes.
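For comparison, reading the repeated nodes through the DOM can be sketched as below. It assumes MSXML 6.0 is installed; the XML declaration is left out of the sample string and the list is shortened to keep it brief.

```foxpro
LOCAL lcXML, loDom, loNodes, loNode

TEXT TO lcXML NOSHOW
<person>
   <firstname>Olaf</firstname>
   <lastname>Doschke</lastname>
   <speaks>
      <naturallanguage>German</naturallanguage>
      <naturallanguage>English</naturallanguage>
      <programminglanguage>Basic</programminglanguage>
      <programminglanguage>Pascal</programminglanguage>
   </speaks>
</person>
ENDTEXT

loDom = CREATEOBJECT("MSXML2.DOMDocument.6.0")
loDom.async = .F.

IF loDom.loadXML(lcXML)
   * XPath query returning every programminglanguage node under speaks
   loNodes = loDom.selectNodes("/person/speaks/programminglanguage")
   FOR EACH loNode IN loNodes
      ? loNode.text
   ENDFOR
ELSE
   ? "Not loaded:", loDom.parseError.reason
ENDIF
```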

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: XMLToCursor Parse Error

Olaf, using the serializer to ingest your document, it could be something like this:

CODE --> VFP

LOCAL XMLS AS XMLSerializer
LOCAL XMLV AS Empty
LOCAL Source AS String

TEXT TO m.Source NOSHOW FLAGS 1
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<person>
   <firstname>Olaf</firstname>
   <lastname>Doschke</lastname>
   <speaks>
      <naturallanguage>German</naturallanguage>
      <naturallanguage>English</naturallanguage>
      <programminglanguage>Basic</programminglanguage>
      <programminglanguage>6502 assembler</programminglanguage>
      <programminglanguage>Pascal</programminglanguage>
      <programminglanguage>68000er Assembler</programminglanguage>
      <programminglanguage>C/C++</programminglanguage>
      <programminglanguage>...</programminglanguage>
   </speaks>
</person>
ENDTEXT

m.XMLS = CREATEOBJECT("XMLSerializer")

m.XMLV = m.XMLS.XMLtoVFP(m.Source)

* fetch the text from nodes directly

? "------- Natural languages --------"
? m.XMLV.person.Speaks.Naturallanguage(1).xmltext(1)
? m.XMLV.person.Speaks.Naturallanguage(2).xmltext(1)

* or programmatically, but assuming simple non-mixed text nodes

? "------- Programming languages --------"

LOCAL Cases AS Integer
LOCAL CaseIndex AS Integer

m.Cases = m.XMLS.GetArrayLength(m.XMLV.person.speaks.programminglanguage)

IF m.Cases != 0
	FOR m.CaseIndex = 1 TO m.Cases
		? m.XMLV.person.speaks.programminglanguage(m.CaseIndex).xmltext(1)
	ENDFOR
ELSE
	? m.XMLV.person.speaks.programminglanguage.xmltext(1)
ENDIF 

Showing all text from a point in a tree in one go is offered by the DOM as a convenience, but it's not strictly XML.
