[Solved] LibreOffice File format error found at SAXParse

markhand
Posts: 3
Joined: Sun Jan 08, 2017 10:11 pm

[Solved] LibreOffice File format error found at SAXParse

Post by markhand »

Hi,

I'm desperately hoping someone can help. My partner needs to submit the attached file in the next 12 hours, but on her final save the file became corrupted. We'd be so grateful if someone could take a look and repair it. It's a docx file... we have now absorbed the advice from previous posts about saving in odt format.

The error is:
File format error found at
SAXParseException: '[word/document.xml line 2]: Attribute w:eastAsiaTheme redefined
', Stream 'word/document.xml', Line 2, Column 11298(row,col).

Thanks in advance,
Mark
Last edited by Hagar Delest on Mon Jan 09, 2017 9:01 am, edited 2 times in total.
Reason: tagged [Solved].
Libre Office 5.2.0.4 Windows 10 Home 1607
Hagar Delest
Moderator
Posts: 32695
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: LibreOffice File format error found at SAXParse

Post by Hagar Delest »

Hi and welcome to the forum!

I opened the document.xml file with XML Copy Editor and deleted a style just after the VOICE 1 string, where the file seemed to be cut.
Only then could I format the file with an XML scheme, and it seems to have retrieved the text. I hope you lost only the formatting.
LibreOffice 24.2 on Xubuntu 24.04 and 7.6.4.1 portable on Windows 10
RoryOF
Moderator
Posts: 34637
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: [Solved] LibreOffice File format error found at SAXParse

Post by RoryOF »

Try the attached file and check that the formatting is as desired. I did not have to remove any text, only a duplicate formatting command.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Hagar Delest
Moderator
Posts: 32695
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: LibreOffice File format error found at SAXParse

Post by Hagar Delest »

Rory's file is much better than mine!
LibreOffice 24.2 on Xubuntu 24.04 and 7.6.4.1 portable on Windows 10
RoryOF
Moderator
Posts: 34637
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: LibreOffice File format error found at SAXParse

Post by RoryOF »

I was able to find the duplicated formatting command and removed it.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
RoryOF
Moderator
Posts: 34637
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: LibreOffice File format error found at SAXParse

Post by RoryOF »

If markhand's partner is writing another play, note that there is a radio play formatting template at
http://extensions.openoffice.org/en/pro ... g-template
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Hagar Delest
Moderator
Posts: 32695
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: LibreOffice File format error found at SAXParse

Post by Hagar Delest »

Strange. I first tried to format the XML file, but the editor couldn't do it. I then deleted something after the VOICE 1 string and it did format the file afterward, without the error about the duplicate formatting. I guessed it was fine. But your finding was better.
LibreOffice 24.2 on Xubuntu 24.04 and 7.6.4.1 portable on Windows 10
RoryOF
Moderator
Posts: 34637
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: LibreOffice File format error found at SAXParse

Post by RoryOF »

I opened the archive that is the .docx file and found document.xml (in the internal word folder). I opened this in situ using XML Copy Editor, which reported a format error of a duplicate attribute at line 2, column 11950. I deleted that attribute, tested for "well formed" in XML Copy Editor, got another duplicate attribute report at the same location, and deleted that too. This time XML Copy Editor reported that the file was well formed. I saved it and was prompted to update it in the .docx archive. The updated .docx archive was what I uploaded.
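
For anyone who would rather script the first part of this procedure, here is a minimal Python sketch (not the method used for the repair itself). It opens the .docx as a zip archive and runs every XML part through the standard library's expat parser. The function name find_xml_errors is hypothetical. It only reports where a part stops being well formed; it does not repair anything.

```python
import zipfile
from xml.parsers import expat


def find_xml_errors(docx_path):
    """Return (part_name, line, column, message) for each XML part that is not well formed."""
    problems = []
    # A .docx file is an ordinary zip archive; the main text lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            if not name.endswith((".xml", ".rels")):
                continue  # skip images and other binary parts
            parser = expat.ParserCreate()
            try:
                parser.Parse(archive.read(name), True)
            except expat.ExpatError as err:
                # expat stops at the first error it meets in each part.
                problems.append(
                    (name, err.lineno, err.offset, expat.ErrorString(err.code))
                )
    return problems
```

Because the parser stops at the first error in each part, the check has to be run again after every fix, much like the second duplicate attribute that only showed up after the first one was deleted.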

XML Copy Editor is OK, but its diagnostic messages are not as explicit as they might be. I would welcome a pointer to a more explicit XML analyser that runs on Linux, preferably one with a GUI, as this is easier for file repairs.
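
On the wish for more explicit diagnostics: one possible approach, sketched here in Python, is to scan the start tags with a regular expression and list every attribute name that occurs more than once within the same tag. The names START_TAG, ATTR_NAME and duplicate_attributes are hypothetical, and this is a rough scan rather than a real XML parser: it assumes the tags themselves are intact and it ignores comments and CDATA sections.

```python
import re
from collections import Counter

# A start tag with its attribute list. Quoted values are matched explicitly
# because they may legally contain a '>' character.
START_TAG = re.compile(
    r"""<([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|'[^']*'))*)\s*/?>"""
)
ATTR_NAME = re.compile(r"""([\w:.-]+)\s*=\s*(?:"[^"]*"|'[^']*')""")


def duplicate_attributes(xml_text):
    """Yield (line, column, element, attribute) for attributes repeated in one start tag."""
    for tag in START_TAG.finditer(xml_text):
        counts = Counter(ATTR_NAME.findall(tag.group(2)))
        for attribute, count in counts.items():
            if count > 1:
                # Line is 1-based; column is the 0-based offset of the tag's '<'.
                line = xml_text.count("\n", 0, tag.start()) + 1
                column = tag.start() - (xml_text.rfind("\n", 0, tag.start()) + 1)
                yield line, column, tag.group(1), attribute
```

Each result names the element and the repeated attribute, so all duplicates in a file show up in one pass instead of one per validation run. Note that the column points at the start of the tag, not at the attribute, so it will not match the column a parser reports.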
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
markhand
Posts: 3
Joined: Sun Jan 08, 2017 10:11 pm

Re: LibreOffice File format error found at SAXParse

Post by markhand »

Thank you so much Hagar Delest and RoryOF
I have downloaded both files. Tears of joy have been shed and the deadline shall be met.
If it's not too much trouble, I'd be grateful if you could delete the repaired file attachments from your posts.
We'll check out the radio play template.
Many thanks,
Mark
Libre Office 5.2.0.4 Windows 10 Home 1607
RoryOF
Moderator
Posts: 34637
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: LibreOffice File format error found at SAXParse

Post by RoryOF »

markhand wrote:Thank you so much Hagar Delest and RoryOF
I have downloaded both files. Tears of joy have been shed and the deadline shall be met.
If it's not too much trouble, I'd be grateful if you could delete the repaired file attachments from your posts.
We'll check out the radio play template.
Many thanks,
Mark
All deleted, including Hagar's.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
markhand
Posts: 3
Joined: Sun Jan 08, 2017 10:11 pm

Re: LibreOffice File format error found at SAXParse

Post by markhand »

Thank you both!
Libre Office 5.2.0.4 Windows 10 Home 1607
John_Ha
Volunteer
Posts: 9584
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: LibreOffice File format error found at SAXParse

Post by John_Ha »

RoryOF wrote:XML Copy Editor is OK, but its diagnostic messages are not as explicit as they might be. I would welcome a pointer to a more explicit XML analyser that ran on linux, preferably running in a GUI as this is easier for file repairs.
There seem to be quite a few when I Google for them. You could try an on-line system like Validate an XML file.

When all else fails, deleting all the XML tags is pretty simple and gets back just the text. Use a regular-expression Find and Replace with the search argument <[^>]+> and an empty replace argument. You cannot do it in Writer, because it produces a single paragraph which will often be over the 64k limit, so I use Notepad++ to do it.
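
The same last-resort approach can also be sketched in Python instead of Notepad++. The search pattern is the one given above. Turning each closing </w:p> into a line break and decoding entities such as &amp; are extra assumptions added here, so that the result is not one giant paragraph. The function names are hypothetical.

```python
import html
import re
import zipfile

TAG = re.compile(r"<[^>]+>")  # the search argument from the post


def xml_to_text(xml_text):
    """Strip every XML tag and keep only the text between the tags."""
    # Assumption: </w:p> closes a paragraph, so turn it into a line break first.
    with_breaks = xml_text.replace("</w:p>", "\n")
    # Decode entities such as &amp; once the tags are gone.
    return html.unescape(TAG.sub("", with_breaks))


def rescue_text(docx_path):
    """Read word/document.xml out of the .docx archive and return its plain text."""
    with zipfile.ZipFile(docx_path) as archive:
        raw = archive.read("word/document.xml").decode("utf-8", errors="replace")
    return xml_to_text(raw)
```

This works on a file that is not well formed, because it never parses the XML. All formatting is lost, as with the manual find and replace.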
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
John_Ha
Volunteer
Posts: 9584
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: LibreOffice File format error found at SAXParse

Post by John_Ha »

markhand wrote:we have now absorbed the advice from previous posts about saving in odt format.
That is excellent advice to follow!
LO 6.4.4.2, Windows 10 Home 64 bit