Forum Discussion

mspatel's avatar
mspatel
Contributor
10 years ago

TestComplete and Perforce Version Control

Hi 

We have been facing a strange problem where several statements in TestComplete scripts that are checked into Perforce version control get converted into "Japanese characters". Unfortunately, I am not able to reproduce the issue when I try intentionally.

- This problem started occurring after the upgrade to TestComplete 11.

- Could it be due to multiple users accessing the scripts?

- Could it be related to the way changes are checked in (parent object vs. individual object)?

This issue is seriously costing our team a lot of time.

Thanks

  • brk9394's avatar
    brk9394
    Occasional Contributor

    Hi,

    We had the same issue several months ago, when about half of our TC files became corrupted in this way. Here's the "short" summary of how I resolved it.

    The basic problem stems from a file encoding compatibility issue between Perforce (P4) and TC (I'm not exactly sure what triggered it, but we only saw the issue after adding a 2nd user). The theory was that when TC checks out P4 files that were previously saved using a text editor, there is a chance that the encoding conversion of the CR/LF characters gets messed up in the process and adds an extra "0D" byte to the linefeed.

    In your screenshot example, every 2nd line is 'Japanese' because there is likely an extra byte on the linefeed character, which offsets the byte alignment by one; every 2nd line is then decoded as the wrong UTF-16 (2-byte) code points, which land in ranges used by non-English languages such as Japanese. The next linefeed is offset by one more byte, which effectively corrects the alignment for that line. Also, if I recall, the corrupted files may have had a double byte-order mark (BOM), which was another symptom of the corruption, but I'll explain this later.
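
    To illustrate, here's a quick Python sketch (illustration only -- the sample text is made up) of how one stray 0x0D byte makes the following line decode as CJK characters:

        # Illustration only: one stray 0x0D byte shifts the UTF-16 LE byte
        # alignment so that plain ASCII text decodes as CJK ("Japanese") glyphs.
        good = "Line one\r\nLine two\r\n".encode("utf-16-le")
        # Corrupt the first line ending: "0D 00 0A 00" becomes "0D 00 0D 0A 00",
        # i.e. the extra 0x0D byte described above.
        bad = good.replace(b"\r\x00\n\x00", b"\r\x00\r\n\x00", 1)
        print(bad.decode("utf-16-le", errors="replace"))
        # Every byte pair after the stray 0x0D is off by one, so "Line two"
        # comes out as high code points such as U+4C00 (CJK-looking glyphs).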

    We did report the issue to Perforce (Feb. 2015), and I believe a fix was planned. If you contact their support, they should hopefully be aware of the bug. I haven't followed up with them, though.

    A frustrating part of the problem was that even reverting a file in Perforce to a pre-corruption state (e.g. head revision) does not solve it. This is because of how P4 handles the 'encoding type' file information. But there is a solution (or at least a workaround)...

    Here's what I'd suggest:

    PART 1: fix the files
    - check everything into P4, then manually back up/archive your Perforce workspace
    - download HxD (a good, free hex editor)
    - close P4 & TC
    - open the corrupted file using HxD (probably *.vbs files for VBScript)
    - press Ctrl+R (Find/Replace)
    - search for: "0D 00 0D 0A" (no quotes) -- these are the erroneous CR/LF hex byte values
    - replace with: "0D 00 0A" (no quotes)
        ** I'm assuming here that your VBScript files were corrupted in the same manner as my JScript files; if that's not the case, it may not work. If it doesn't, post a small sample file and I can look at it.
    - set Datatype = Hex-values
    - then proceed to Replace All
    - repeat as needed for each corrupted file (a scripted version of this find/replace is sketched just after this list)
    - open the file in a text editor such as Notepad++ to confirm it is fixed; also pay attention to the file encoding, which is displayed in the bottom-right corner of the N++ status bar
    - you may also need to manually delete the byte order mark (BOM) at the beginning of the file (sorry, I don't remember if this was required); I think I had to delete some double BOMs using HxD where there were both UTF-8 and UTF-16 BOMs. Deleting both should reset the encoding to ANSI as the default.
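
    If you have a lot of corrupted files, the same find/replace can be scripted. Here's a rough Python sketch (my quick take, not something I've run on your files -- the filename is a placeholder, and the byte patterns assume the same corruption as mine, so adjust them to what you actually see in HxD):

        # Rough sketch of the HxD find/replace above, for batch-fixing files.
        # "Unit1.vbs" is a placeholder; run it on a backup copy first.
        from pathlib import Path

        BAD = bytes.fromhex("0D 00 0D 0A")   # erroneous CR/LF byte sequence
        GOOD = bytes.fromhex("0D 00 0A")     # corrected sequence per the steps above

        path = Path("Unit1.vbs")             # placeholder corrupted script file
        data = path.read_bytes()
        count = data.count(BAD)
        if count:
            path.with_suffix(".vbs.bak").write_bytes(data)   # keep a safety copy
            path.write_bytes(data.replace(BAD, GOOD))
        print(f"{path}: fixed {count} corrupted line ending(s)")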

    PART 2: prevent the problem from recurring
    - update P4 clients and server to the latest P4 version (they may have a fix by now, not sure)
    - set all your TC Projects Properties' Units Encoding to UTF-8 (or ANSI, but I wouldn't use Auto or UTF-16, because I suspected that 'Auto/UTF-16' may have also been part of the problem).
    - then edit each repaired file in TC (any change will do; for example, just add a space somewhere -- but there has to be a distinct change in the file), then save it and check it in to P4

    PART 3: multi-users in Perforce
    - I also suspect that multi-user use (P4-TC) is part of the cause of this issue
    - without getting into details in this thread, there is a major limitation in TC with the default P4 integration: by default, every TC user must use the same P4 workspace, which is virtually useless in a multi-user setup. However, I found a workaround for this as well. Using a text editor, simply delete the workspace connection information in the project suite file (*.pj) (search for 'auxpath' in the file; a rough sketch of this follows below). Once the workspace connection is gone, you will be able to use the P4 integration in TC as expected.
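
    Here's that cleanup as a rough Python sketch too (the filename is a placeholder, and it assumes the file is ANSI/UTF-8 text with the 'auxpath' entry on its own line -- inspect your own .pj file before trying this, and keep a backup):

        # Rough sketch: strip the P4 workspace ('auxpath') lines from the
        # project suite file. Works on bytes, so it assumes ANSI/UTF-8 text.
        from pathlib import Path

        path = Path("MySuite.pj")            # placeholder project suite file
        lines = path.read_bytes().splitlines(keepends=True)
        kept = [ln for ln in lines if b"auxpath" not in ln.lower()]
        if kept != lines:
            path.with_suffix(".pj.bak").write_bytes(b"".join(lines))  # safety copy
            path.write_bytes(b"".join(kept))
            print(f"removed {len(lines) - len(kept)} workspace line(s)")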

    Hope that helps,
    Brian

     

    • mspatel's avatar
      mspatel
      Contributor
      Wow, thanks a lot -- this is really helpful. I am surprised no one from SmartBear has acknowledged this issue. I have spent hours and hours fixing these scripts, and after some time they are back to the corrupted state. And yes, your suspicion is right: it actually started after one of my colleagues began working in parallel with me. Coincidentally, TC11 was rolled out at the same time. Regardless, this is a major issue. I'm going to try your workaround, but SmartBear should be very proactive about issues like this... it dents their product's reputation.
      • brk9394's avatar
        brk9394
        Occasional Contributor

        You're welcome. I'd be curious to know how it goes when you get a chance to try it. Cheers.

    • mspatel's avatar
      mspatel
      Contributor

      So I followed the suggested steps, except the last one:

       

      "- also, you may need to manually delete the byte order mark at the beginning of the file (BOM) (sorry I don't remember if this was required); I think I may have had to delete some double BOM's using HxD where there was both UTF-8 and UTF-16 BOM's. Deleting both should reset the encoding to ANSI as default." 

       

      I see some statements converted back to English, but I lost some as well. I have attached a file comparison of before and after the hex replacements.

       

      Any hint what's missing here?

      • brk9394's avatar
        brk9394
        Occasional Contributor

        Yes, I think I know what the problem is. Our starting points are different -- most likely your corrupted file's encoding is different from what mine was.

         

        Mine was UTF-16 (LE) (LE = Little Endian). See the attached file for a sample (the yellow highlighting indicates the encoding).

         

        If you look at the first 2 or 3 bytes of your original corrupted file in HxD, it's probably either UTF-8 (hex = EF BB BF) or UTF-16 (BE) (hex = FE FF). It's possible, but very unlikely, that you have one of the other 5 UTF encoding types. A quick way to check is sketched below.
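
        If it helps, here's a quick Python sketch of that check (the filename is a placeholder for your corrupted file):

            # Print the file's first bytes and the BOM (if any) they match.
            from pathlib import Path

            BOMS = {
                b"\xef\xbb\xbf": "UTF-8",
                b"\xfe\xff": "UTF-16 (BE)",
                b"\xff\xfe": "UTF-16 (LE)",
            }

            head = Path("Unit1.vbs").read_bytes()[:4]
            print("first bytes:", head.hex(" ").upper())
            for bom, name in BOMS.items():
                if head.startswith(bom):
                    print("BOM suggests:", name)
                    break
            else:
                print("no BOM found (file is probably ANSI)")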

         

        From my screenshot, and for my encoding type, the actual Find/Replace values should have been to find "0D 00 0D 0A 00" and replace it with "0D 00 0A 00" (sorry -- my original post worked for me, but it was not 100% accurate because the pattern was offset by 1 byte). Please compare both screenshots to see the Find/Replace difference.
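
        In other words, for UTF-16 (LE) files, the byte patterns in the sketch from my earlier post should be:

            # Corrected patterns for UTF-16 (LE): anchored on whole 2-byte
            # code units (note the trailing 00 versus my first post).
            BAD = bytes.fromhex("0D 00 0D 0A 00")
            GOOD = bytes.fromhex("0D 00 0A 00")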

         

        However, based on your encoding, it will probably require a slightly different treatment. Without seeing the hex values, I can't tell you what the Find/Replace values should be.

         

        So I would suggest posting a screenshot of the first few lines of your original corrupted file in HxD, or the original file itself (but be careful of proprietary content, as this is a public forum). Then I can tell you what the Find/Replace should be for your files.

         

        Regards,

        Brian