Forum Discussion

mats2021
New Contributor
4 years ago

Swedish char in filename

Hi all, 

 

I have a script that reads an XML tag and uses that name to search for a matching XML file.

It works fine so long as I don't have any Swedish characters in the file name, but that is sometimes required.

 

context.content = groovy.xml.XmlUtil.serialize(new XmlParser().parse(groovyUtils.projectPath+"/mockResponses/${fileToLoad}.xml"))

 

fileToLoad can be, for example, svenljunga.xml, which works, or högsby.xml, which does not.

The problem characters are å, ä and ö. The error is:

java.net.MalformedURLException: no protocol:

This is on a Linux platform.

How can I get around this?

 

Thanks

 

 

  • richie
    Community Hero
    Hey mats2021,

    You may need to change the charset ReadyAPI is using. I'm guessing the charset in use doesn't support the Swedish characters, otherwise you wouldn't be posting this message. (UTF-8, the default for XML, does cover å, ä and ö, so the JVM on your Linux box may be running with a different default.) I was going to suggest UTF-16, but Stack Overflow suggests ISO-8859-1 as the relevant charset, so I'd change the character encoding in my system properties to ISO-8859-1.
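
Before changing anything, it may help to see which encodings the JVM is actually running with. The following is only a rough sketch: `CharsetReport` and `describe` are made-up names, and `sun.jnu.encoding` is a JVM-internal property that may be missing on some runtimes. The same calls can be made from a Groovy script.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic helper; the class and method names are illustrative only.
class CharsetReport {

    // Collect the encoding-related settings of the running JVM.
    static String describe() {
        return "default charset  = " + Charset.defaultCharset().name() + "\n"
             + "file.encoding    = " + System.getProperty("file.encoding") + "\n"
             // JVM-internal property used for file names; may print "null" on some runtimes.
             + "sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the values are not what you expect, the usual way to override the default is a `-Dfile.encoding=<charset>` option on the JVM command line. Where that option goes for your ReadyAPI install is something you would need to check.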

    Ta

    Rich
    • richie
      Community Hero
      Hey mats2021,

      I just reread your message and realised I hadn't read it properly. I'd still set the charset to ISO-8859-1. HOWEVER, this is not in a payload but a file name that will be in a URL, right?
      In that case you need to make sure the Swedish characters are percent-encoded (URL-encoded).
      Percent-encoding replaces reserved characters and anything outside ASCII with % followed by the hex value of each byte. For example, a forward slash ('/') used as data becomes %2F.

      So an 'o' with two horizontal dots above it (I don't know what it's called in Swedish so I can't help search for that, but in English the two dots are known as a 'diaeresis', 'umlaut' or 'trema' diacritic) encodes as %c3%b6, which is its two UTF-8 bytes. So in your URL you'd replace that single 'ö' character with '%c3%b6'.
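
As a rough sketch of that encoding, the JDK can do the replacement for you. `MockFileUrls`, `percentEncode`, `toFileUrl` and the `/tmp/project` path are made up for illustration and are not part of ReadyAPI or the original script. The `URLEncoder.encode(String, Charset)` overload assumes Java 10 or later. The hex digits come out in upper case (%C3%B6), which is equivalent to %c3%b6.

```java
import java.io.File;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ReadyAPI or the original script.
class MockFileUrls {

    // Percent-encode a single name as UTF-8.
    // URLEncoder targets form data and writes a space as "+",
    // so that is swapped for "%20" afterwards.
    static String percentEncode(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Build a file: URL for <projectPath>/mockResponses/<fileToLoad>.xml.
    // File.toURI() supplies the protocol, and toASCIIString()
    // percent-encodes non-ASCII characters as UTF-8.
    static String toFileUrl(String projectPath, String fileToLoad) {
        File file = new File(projectPath + "/mockResponses/" + fileToLoad + ".xml");
        return file.toURI().toASCIIString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("/"));            // %2F
        System.out.println(percentEncode("h\u00f6gsby"));  // h%C3%B6gsby
        // On Linux the result ends with /mockResponses/h%C3%B6gsby.xml
        System.out.println(toFileUrl("/tmp/project", "h\u00f6gsby"));
    }
}
```

Since `toFileUrl` returns a string that starts with `file:`, it would presumably also avoid the "no protocol" complaint, but I haven't tried it against ReadyAPI.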

      Hope I've been clear!

      Ta

      Rich