Macro does not comply with Unicode codes

Creating a macro - Writing a Script - Using the API (OpenOffice Basic, Python, BeanShell, JavaScript)
ernsttremel
Posts: 3
Joined: Tue Dec 14, 2010 1:02 am

Macro does not comply with Unicode codes

Post by ernsttremel »

Hello,

I wrote this macro
(cf. "DADABHAIBHARUCHAMakro-Transcript-Avestan-Felder_Makro.odt")
to do search-and-replace operations.

But it treats the Unicode codes for lower-case letters like those for upper-case letters, e.g.

"a" = Chr(clng("&H61")) is treated identically to "A" = Chr(clng("&H41"))

Cf. the attached "screenshot.png".

What has to be added to the macro code to avoid this unexpected macro behaviour?

Kind regards,
Ernst Tremel
Attachments
screenshot.png
DADABHAIBHARUCHAMakro-Transcript-Avestan-Felder_Makro.odt
(22.09 KiB) Downloaded 200 times
OpenOffice.org 3.2.1
OOO320m19 (Build:9505)
ooo-build 3.2.1.4, Ubuntu package 1:3.2.1-7ubuntu1
Charlie Young
Volunteer
Posts: 1559
Joined: Fri May 14, 2010 1:07 am

Re: Macro does not comply with Unicode codes

Post by Charlie Young »

I'm not at all sure this is what the problem is, but it's probably worth looking at. It appears that the property SearchDescriptor.SearchCaseSensitive defaults to FALSE. Try setting it to TRUE. That is, in your code, try

Replace.SearchCaseSensitive = TRUE

Note that the ReplaceDescriptor seems to inherit this property from SearchDescriptor.
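
To illustrate why that default matters, here is a minimal sketch in plain Python, not the UNO API itself. The function `replace_all` and its `case_sensitive` flag are hypothetical names that only mimic the effect of SearchCaseSensitive; the search engine inside the office suite may differ in detail.

```python
import re

def replace_all(text, search, replacement, case_sensitive=False):
    """Replace every occurrence of `search` in `text`.

    `case_sensitive` is a hypothetical stand-in for the descriptor's
    SearchCaseSensitive property; it defaults to False here to mirror
    the default the property appears to have.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape makes the search string literal; the lambda keeps the
    # replacement literal too (no backslash escapes are interpreted).
    return re.sub(re.escape(search), lambda _m: replacement, text, flags=flags)

# Without case sensitivity, "a" also matches "A".
insensitive = replace_all("a A", "a", "x")
# With case sensitivity, only the lower-case "a" is replaced.
sensitive = replace_all("a A", "a", "x", case_sensitive=True)
```

With the flag off, both letters are replaced, which matches the behaviour described in the first post; with it on, only the exact-case match is touched.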
Apache OpenOffice 4.1.1
Windows XP
ernsttremel
Posts: 3
Joined: Tue Dec 14, 2010 1:02 am

Re: Macro does not comply with Unicode codes

Post by ernsttremel »

Thank you.
I first tried to place

Replace.SearchCaseSensitive = TRUE

here (*)
and then
there (x),
with the result that the macro didn't work anymore.

Doc = ThisComponent
Replace = Doc.createReplaceDescriptor
' (*)
For x = 1 To 50
    ' (x)
    Replace.SearchString = tr(x)
    Replace.ReplaceString = av(x)
    Doc.replaceAll(Replace)
Next x

End Sub
B Marcelly
Volunteer
Posts: 1160
Joined: Mon Oct 08, 2007 1:26 am
Location: France, Paris area

Re: Macro does not comply with Unicode codes

Post by B Marcelly »

Hi,
The code should work with Replace.SearchCaseSensitive = TRUE before the For loop.
But it entirely depends on your tables.

The characters in table av() may be incorrect. Verify them by printing their Unicode values, for example:

Code: Select all

print "av1", asc(av(1))
print "av75", asc(av(75))
For all elements I get the value 55298.
You should use Unicode values instead of characters to fill the table av().
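
A possible explanation for the value 55298, offered as a hedged sketch rather than a diagnosis: 55298 is hexadecimal D802, which lies in the UTF-16 high-surrogate range D800..DBFF. If av() holds characters above FFFF (the Unicode Avestan block is U+10B00..U+10B3F), each one occupies two 16-bit units, and Asc() would report only the first unit, which is the same for the whole block. That Asc() behaves this way is an assumption; the plain-Python code below only checks the arithmetic, and the helper names are made up for illustration.

```python
def is_high_surrogate(unit):
    """True if a 16-bit value is the first half of a UTF-16 surrogate pair."""
    return 0xD800 <= unit <= 0xDBFF

def code_point_range(high_unit):
    """First and last code point whose UTF-16 encoding starts with `high_unit`."""
    first = 0x10000 + ((high_unit - 0xD800) << 10)
    return first, first + 0x3FF

value = 55298  # the number printed for every element of av()
first_cp, last_cp = code_point_range(value)

# Does the range behind this high surrogate contain the Avestan block?
covers_avestan = first_cp <= 0x10B00 and 0x10B3F <= last_cp
```

If this reading is right, identical Asc() values would not by themselves prove that the table entries are wrong.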

Your table is not complete; there are holes. For these holes the code replaces an empty string with an empty string. Maybe it does no harm, but it is bad coding.
Bernard

OpenOffice.org 1.1.5 / Apache OpenOffice 4.1.1 / LibreOffice 5.0.5
MS-Windows 7 Home SP1
ernsttremel
Posts: 3
Joined: Tue Dec 14, 2010 1:02 am

Re: Macro does not comply with Unicode codes

Post by ernsttremel »

Thank you.
Indeed, the code is working with

Replace.SearchCaseSensitive = TRUE

before the For loop.

I corrected the macro by deleting all the holes.
The characters in table av() are all correct.

First I wanted to define these av() characters by

Character = chr(clng("&H2026"))

But this only works for hex numbers up to FFFF, not for bigger ones.

Please have a look at
"gr.png" and "Hex-bigger-than-FFFF.jpg"
Attachments
Hex-bigger-than-FFFF.JPG
gr.png
B Marcelly
Volunteer
Posts: 1160
Joined: Mon Oct 08, 2007 1:26 am
Location: France, Paris area

Re: Macro does not comply with Unicode codes

Post by B Marcelly »

ernsttremel wrote: "But this only works for hex numbers up to FFFF, not for bigger ones."

That is normal.
OpenOffice internally handles characters as 16-bit Unicode (UTF-16). See the Help (F1) for the Basic function Chr().
So Chr() accepts only a single 16-bit value, and you cannot pass it a code point above FFFF as you would with UTF-32.

For your information, OpenOffice documents store characters as UTF-8, but this variable-length encoding is not well suited to in-memory manipulation, hence the use of UTF-16.
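
Characters above FFFF can still be represented in UTF-16, as a pair of 16-bit units (a surrogate pair). The sketch below, in plain Python, computes the two units for a code point; `to_surrogate_pair` is a hypothetical helper, not an OpenOffice function. Whether concatenating two Chr() results built from these values produces the intended character in a given OpenOffice Basic version is an assumption that would need testing.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into its UTF-16 high and low units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF need a pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

# U+10B00 is the first code point of the Unicode Avestan block.
pair = to_surrogate_pair(0x10B00)

# Cross-check against Python's own UTF-16 encoder.
encoded = "\U00010B00".encode("utf-16-be")
matches_encoder = encoded == b"".join(u.to_bytes(2, "big") for u in pair)
```

Both units fit in 16 bits, so each one on its own is within the range that Chr() is documented to accept.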

Ref: Wikipedia article on Unicode
Bernard
