Forum Discussion

eng_sysadmin
Occasional Contributor
8 years ago
Solved

[Collaborator][Cache] How do I remove old files from the server cache for Collaborator 9.5.9501?

Our server cache is getting large, and I'd like to archive the old reviews and remove their files from the server. Archiving the old reviews is simple (select the review and press "SAVE TO ZIP"). My problem is deleting the associated files from the server cache directory (<INSTALL_DIR>\tomcat\collaborator-content-cache).

I could only find one relevant message board topic (https://community.smartbear.com/t5/Collaborator/CodeReviewer-Archive-Reduce-content-cache-which-inflates-which/m-p/92774/highlight/true#M1076). The response was to manually review and remove old files/directories. Of course, because the file names are hashes, you have no idea which file is associated with which review. So this is really a brute-force option that is likely to have a lot of unintended consequences.

Does anyone have a better way to remove old server cache files? It would be especially nice if you could target the files associated with a specific review.

  • eng_sysadmin
    8 years ago

    I wrote the attached Perl script, which archives the cache files for the review number given as input. It's smart enough not to remove files used by later reviews, and it creates a TGZ with all the files associated with the review. If you want to run it against a range of reviews, replace the (@ARGV) of the foreach loop with (1..n) to archive reviews 1 through n.
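The attachment itself is not reproduced in this thread, so what follows is only a rough Python sketch of the same idea, not the Perl script. It assumes you already have a mapping from review number to the set of cache file hashes that review uses, and that each cache file sits directly in the cache directory under its hash as the file name (the real cache layout may well be nested). The names `hashes_safe_to_remove`, `archive_review`, and `hashes_by_review` are hypothetical.

```python
import tarfile
from pathlib import Path


def hashes_safe_to_remove(review_id, hashes_by_review):
    """Return the hashes used by review_id that no later review also uses."""
    removable = set(hashes_by_review.get(review_id, ()))
    for other_id, hashes in hashes_by_review.items():
        if other_id > review_id:
            removable -= set(hashes)
    return removable


def archive_review(review_id, hashes_by_review, cache_dir, tgz_path):
    """Put every file of the review into a TGZ, then delete only the
    files that later reviews do not need. Returns the removed hashes.

    Assumption: files live at cache_dir/<hash> (flat layout).
    """
    cache_dir = Path(cache_dir)
    all_hashes = sorted(hashes_by_review.get(review_id, ()))
    removable = hashes_safe_to_remove(review_id, hashes_by_review)

    # The archive gets all files, including ones that stay in the cache.
    with tarfile.open(tgz_path, "w:gz") as tar:
        for file_hash in all_hashes:
            tar.add(cache_dir / file_hash, arcname=file_hash)

    # Only delete once the archive has been written and closed.
    for file_hash in removable:
        (cache_dir / file_hash).unlink()
    return removable
```

Like the description above, this only protects files used by later reviews, so it is meant to be run on reviews in ascending order. Try it on a copy of the cache directory first.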

4 Replies

  • eng_sysadmin
    Occasional Contributor

    Can someone confirm whether the following SQL will identify the files associated with a given review?

    SELECT rvl.review_id, v.version_id, v.version_filepath, v.version_contentmd5
    FROM review_version_list rvl
    JOIN version v ON v.version_id = rvl.version_id
    WHERE rvl.review_id = <REVIEW#>;
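For anyone who wants to try the query from a script, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in so the example runs on its own. The real Collaborator database engine, driver, and parameter placeholder syntax will differ, and the table layout below is guessed from the column names in the query, not confirmed against the actual schema. The function names and sample rows are hypothetical.

```python
import sqlite3

# Same join as the query above, with the review number as a bound parameter.
QUERY = """
SELECT rvl.review_id, v.version_id, v.version_filepath, v.version_contentmd5
FROM review_version_list rvl
JOIN version v ON v.version_id = rvl.version_id
WHERE rvl.review_id = ?
ORDER BY v.version_id
"""


def files_for_review(conn, review_id):
    """Return (review_id, version_id, filepath, md5) rows for one review."""
    return conn.execute(QUERY, (review_id,)).fetchall()


def make_demo_db():
    """Build a tiny stand-in database (assumed schema, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE version (
            version_id INTEGER PRIMARY KEY,
            version_filepath TEXT,
            version_contentmd5 TEXT
        );
        CREATE TABLE review_version_list (
            review_id INTEGER,
            version_id INTEGER
        );
        INSERT INTO version VALUES (10, 'src/a.c', 'aaa111');
        INSERT INTO version VALUES (11, 'src/b.c', 'bbb222');
        INSERT INTO version VALUES (12, 'src/c.c', 'ccc333');
        INSERT INTO review_version_list VALUES (1, 10);
        INSERT INTO review_version_list VALUES (1, 11);
        INSERT INTO review_version_list VALUES (2, 11);
        INSERT INTO review_version_list VALUES (2, 12);
    """)
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    for row in files_for_review(demo, 1):
        print(row)
```

In the demo data, version 11 belongs to both review 1 and review 2, which is exactly the shared-file case that makes deleting by review tricky.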

      • eng_sysadmin
        Occasional Contributor

        The SQL statement was only meant to identify the files associated with a given review. I have no intention of making changes to the database. I just want someone to confirm whether my statement does indeed identify the correct files. Once that is confirmed, I plan to create a script that queries the database and then moves the files from the cache directory to an archive directory within the file system.

        I already saw the article you linked to. It describes a brute-force archive method. There is no way to know which reviews will be archived and which will not. Worse yet, the method will very likely end up archiving some, but not all, of the files for a review whose time frame overlaps the arbitrarily chosen timestamp. It also requires a lot of manual traversal of the cache directory structure to identify the files that can be archived. Finally, there is the statement "If you are doing document reviews, we don't recommend you archive any of the files in the content cache."
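The plan described above (query the database, then move the matching files from the cache directory to an archive directory) might look roughly like this in Python for the file-moving half. This is a sketch under assumptions, not a tested procedure for Collaborator: it assumes each cache file's name equals the content MD5 returned by the query, which has not been confirmed in this thread, and `move_cached_files` is a hypothetical helper. It defaults to a dry run so you can inspect what would be moved.

```python
import shutil
from pathlib import Path


def move_cached_files(md5s, cache_dir, archive_dir, dry_run=True):
    """Move files whose name is in md5s from cache_dir to archive_dir.

    Subdirectories below cache_dir are recreated under archive_dir.
    Returns the list of (source, destination) pairs. With dry_run=True
    nothing is moved. archive_dir should be outside cache_dir.
    Assumption: a cache file's name is its content MD5.
    """
    cache_dir = Path(cache_dir)
    archive_dir = Path(archive_dir)
    wanted = set(md5s)

    # Collect the full plan first so the tree is not modified while walking it.
    planned = []
    for src in sorted(cache_dir.rglob("*")):
        if src.is_file() and src.name in wanted:
            planned.append((src, archive_dir / src.relative_to(cache_dir)))

    if not dry_run:
        for src, dest in planned:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
    return planned
```

Moving rather than deleting keeps the operation reversible: if a review turns out to need a file, it can be copied back to the same relative path.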