Forum Discussion

ravi_ava
Occasional Contributor
5 years ago

SOH and STX characters are being replaced with question marks (?) in the SoapUI XML file

We have been trying to create a project in SoapUI to invoke a REST API service. The service's POST method accepts text containing special characters (SOH and STX) as the request body. When we save the project as an .xml file, the SOH and STX characters are replaced with question marks (?) in the XML file.

The following configuration is used in SoapUI:

Service : REST API

Method: POST

Content-type: text/plain

Charset : utf-8
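
For anyone who wants to reproduce the request outside SoapUI, here is a minimal Python sketch of the configuration above. The endpoint URL and the `build_request` helper are assumptions made up for illustration; nothing is actually sent.

```python
import urllib.request

SOH = "\x01"  # Start of Heading, U+0001
STX = "\x02"  # Start of Text, U+0002


def build_request(url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a text/plain POST carrying the raw text."""
    body = text.encode("utf-8")  # SOH and STX each encode to a single byte
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


# Hypothetical endpoint; replace with the real service URL.
request = build_request("http://localhost:8080/api/messages", f"hello{SOH}{STX}world")
print(request.get_method(), request.data)
# To actually send it: urllib.request.urlopen(request)
```

If the service accepts this request, the control characters themselves are fine on the wire and the problem is limited to how the project file is saved.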


15 Replies

  • richie
    Community Hero
    Hi,

    Just as an aside, are you sure you have the ContentType correct: 'text/plain' as opposed to 'application/xml'? The reason I ask is that ReadyAPI! won't treat your payload as XML; it'll treat it as though it's just a string of characters, NOT XML.
    Obviously if your listening endpoint expects your XML as 'text/plain' then that's something else, as this defeats the object of sending it XML. It also means that when your request hits your endpoint, the endpoint's parser won't do the well-formed XML check and the payload won't be validated against a schema unless the developers add extra code (that isn't required if the ContentType is correct) to handle this. I'd speak to the developers about this cos this is almost surely a defect.

    Anyway....back to the original point ....control char sequences.
    If you look at the top of your XML you'll probably have an XML declaration that indicates the XML version, character set, etc. By default XML uses UTF-8. The control character sequences you mention (SOH/STX) aren't being recognised by ReadyAPI!'s XML parser, which is why I'm guessing they are being replaced with ? chars.

    I'm curious how you're representing those char sequences cos sticking SOH anywhere would make the XML malformed....I can't visualise how you're representing this in your XML....would you publish the XML just so I can understand better please?
    Typically I would say you need to replace the problem character with its escape sequence (e.g. a numeric character reference such as &#x20; represents a space in XML). You say your webservice accepts the control char sequences; however, I know that control characters (other than tab, line feed and carriage return) are not allowed in XML v1.0, so I'm surprised your endpoint accepts your requests if they contain the control chars, so maybe another reason to talk to your developers? Obviously XML v1.1's been around for quite a while.....first things first, I'd raise these points with the author of your requirement to query whether these control chars should in fact be included in your request payloads cos at first guess I'm thinking they shouldn't.
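
To illustrate the XML v1.0 restriction, here is a small sketch using Python's standard-library parser (expat, which implements XML 1.0). This is not ReadyAPI!'s own parser, so treat it as an illustration of the rule rather than of the tool's behaviour; the `parses_as_xml` helper is a made-up name.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(document: str) -> bool:
    """Return True if the XML 1.0 parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


plain = "<payload>hello world</payload>"
raw_control = "<payload>hello\x01\x02world</payload>"        # raw SOH/STX
char_reference = "<payload>hello&#x01;&#x02;world</payload>"  # numeric references

for label, doc in [("plain", plain), ("raw", raw_control), ("reference", char_reference)]:
    print(label, parses_as_xml(doc))
```

Note that the numeric character references are rejected as well: XML 1.0 only allows a reference to a character that is itself legal in the document.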

    Sorry about the ramble....I was typing this out on my phone so can't see what I've written,


  • nmrao
    Champion Level 3
    Would it be possible to show the problem using screenshots taken before and after the save?
    • ravi_ava
      Occasional Contributor

      In SoapUI, we are passing the JSON request as "helloworld", with the SOH and STX characters between "hello" and "world". But when the project is saved as a soapui.xml file, the request payload is converted from "helloworld" to "hello??world" in the .xml file. (Screenshots attached for more details.)

      • richie
        Community Hero

        Hey ravi_ava 


        OK so - first things first - you're passing "hello world" in JSON - but the screenshot you've provided of the request with the "JSON payload" isn't actually well-formed JSON - did you edit the payload before taking the screenshot? Typically we'd really need to see the unedited version.


        Secondly - are you using SoapUI or ReadyAPI!? The screenshot looks like SoapUI.....it often helps people diagnose and provide solutions if they know the application and version number being used.


        It appears that the Ready.xml file is being created as XML v1.0 (defined in the XML declaration in your project file). XML v1.0 doesn't support control characters (I said this in my last post and wanted to make sure you noticed!), so they need to be converted BEFORE the project .xml file is saved - otherwise you will always get these ?? placeholders.
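
As a rough way to check a payload before it goes into the project file, the sketch below scans a string for characters outside the XML 1.0 `Char` production. The function name `find_xml10_illegal_chars` is hypothetical, and this is a standalone check, not something SoapUI or ReadyAPI! provides.

```python
import re

# Complement of the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_xml10_illegal_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for every character XML 1.0 forbids."""
    return [
        (match.start(), f"U+{ord(match.group()):04X}")
        for match in _XML10_ILLEGAL.finditer(text)
    ]


print(find_xml10_illegal_chars("hello\x01\x02world"))
# [(5, 'U+0001'), (6, 'U+0002')]
```

An empty list means the text can be stored in an XML 1.0 file as-is; anything reported needs to be escaped or removed first.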


        Next you are injecting a JSON payload - right? JSON's default charset is UTF-8. UTF-8 itself can encode SOH and STX, but JSON does not allow raw control characters inside a string; they have to be written as escape sequences.


        I had a look around and found this which lists all the characters supported by UTF-8.


        Fortunately there are escape sequences for the control characters you care about (if there weren't, you wouldn't be able to do anything except delete the chars from your payload):


        for the SOH character sequence replace 'SOH' with the hex escape for code point 0x01

        for the STX character sequence replace 'STX' with the hex escape for code point 0x02


        If you replace the control character sequences with the appropriate escape sequence (in the case above the hex, but you can use others), the characters will be recognised by UTF-8 (which is the charset for both the JSON request payload and the .xml project file), and this will get around the fact that control characters are banned in XML v1.0.
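
One possible way to sketch the escaping idea, assuming the endpoint really parses the body as JSON (the thread's Content-type is text/plain, so whether the service decodes JSON escapes is an assumption to confirm with the developers). Python's `json.dumps` writes control characters as `\uXXXX` escapes, which are plain printable text and therefore safe to store in an XML 1.0 file:

```python
import json

payload = "hello\x01\x02world"  # raw SOH and STX between the two words

# json.dumps escapes control characters below U+0020 as \uXXXX sequences.
escaped = json.dumps(payload)
print(escaped)  # "hello\u0001\u0002world"

# The escaped form contains only printable ASCII, so nothing in it is
# forbidden by XML 1.0.
is_printable_ascii = all(" " <= ch <= "~" for ch in escaped)

# A JSON parser on the receiving side restores the original characters.
round_tripped = json.loads(escaped)
print(is_printable_ascii, round_tripped == payload)
```

Pasting the escaped form into the request editor instead of the raw characters means the saved project file never contains SOH or STX itself.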


        BTW - UTF-8 and UTF-16 are two encodings of the same Unicode character set - so if the char exists in UTF-8 it will in UTF-16 (with the same escape sequences) (UTF-8 vs UTF-16 stuff)
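
A quick sketch showing that SOH and STX have the same code points in both encodings and differ only in the bytes written out (the `describe` helper is a made-up name for illustration):

```python
def describe(ch: str) -> dict[str, object]:
    """Show the code point of a character and its bytes in two encodings."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf-8": ch.encode("utf-8"),
        "utf-16-be": ch.encode("utf-16-be"),  # big-endian, no byte order mark
    }


for name, ch in (("SOH", "\x01"), ("STX", "\x02")):
    print(name, describe(ch))
```

Because escape sequences refer to the code point rather than to the encoded bytes, they are written the same way whichever encoding the file uses.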


        Hope this helps,


        nice one,




  • nmrao
    Champion Level 3
    Also would it be possible to show what you see in the system properties (Menu -> Help -> System Properties)?