Deleting JS comments from JSP pages
I have been asked to delete all the JS comments from our JSP
pages. The task is tedious and daunting. Deleting even a single unexpected
character can make the application unstable. If I make a stable 2-3 year old
system unstable, I am fired. And if I have to sit and delete all the comments
manually, I will set myself on fire. Seriously.
So this is the time to do some brainstorming to avoid the above
two situations. I started with regexes to find the script blocks in a JSP. After 1-2
hours of struggle, I came up with these:
<script[\S\s]*?>
- This regex finds every opening script tag.
<script[\S\s]*?((/>)|(</script>)) - This finds every script block, including
those that load an external JS file.
</script - This
regex finds every closing script tag. (Attention: there is no closing angle bracket at
the end. This is to cope with </script --%>. Yes, some developers comment
out an entire script block with JSP comments.)
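As a rough sketch of how these regexes could be put to work in Java (the class and method names here are made up for illustration, they are not from my actual code): the opening-tag regex and the bracket-less closing-tag regex are combined so that the JS between them lands in a capture group. Note that inside a Java string literal each backslash has to be doubled. The sketch assumes there are no self-closing script tags; those would need the second regex above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the text between each opening script tag
// and the next "</script" (no closing bracket, as explained above).
final class ScriptBlockFinder {
    private static final Pattern SCRIPT_BLOCK = Pattern.compile(
            "<script[\\S\\s]*?>([\\S\\s]*?)</script", Pattern.CASE_INSENSITIVE);

    static List<String> findScriptBodies(String jsp) {
        List<String> bodies = new ArrayList<>();
        Matcher m = SCRIPT_BLOCK.matcher(jsp);
        while (m.find()) {
            bodies.add(m.group(1)); // group 1 is the JS inside the block
        }
        return bodies;
    }

    public static void main(String[] args) {
        String jsp = "<html><script type=\"text/javascript\">var a = 1; // one\n</script>"
                + "<p>text</p><script>/* two */ var b = 2;</script></html>";
        for (String body : findScriptBodies(jsp)) {
            System.out.println("[" + body + "]");
        }
    }
}
```

Everything outside the captured bodies is left alone, which is the whole point: only text known to be JS is ever handed to the comment-removal step.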
Now I have 3 tools to strip all the script blocks out of the
JSP. I am doing this to make sure that I am deleting only JS comments and
nothing else.
Now the second step is to delete all the multi-line comments first.
Why multi-line comments first, why not inline comments?
After 2 hours of testing multi-line first versus inline first, I
have the answer. What would happen if I deleted inline comments first and
hit something like this?
<script type="text/javascript">
/*
document.write("<h1>This is a heading</h1>");
document.write("<p>This is a paragraph.</p>");
document.write("<p>This is another paragraph.</p>");
//*********************
//*/
</script>
The second and third comments (the two inline ones) will get deleted, and the
closing */ goes with them. When I then delete multi-line comments, the
multi-line comment has no end any more and will not get detected. And I
am screwed.
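The ordering problem can be reproduced with a small Java sketch. The two patterns below are deliberately naive stand-ins (they are not the regexes I ended up using, and they do not protect string literals or URLs such as "http://..."); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo of why the order of the two passes matters.
final class CommentOrderDemo {
    // Naive patterns, for illustration only.
    private static final Pattern BLOCK_COMMENT = Pattern.compile("/\\*[\\S\\s]*?\\*/");
    private static final Pattern LINE_COMMENT = Pattern.compile("//.*");

    // Multi-line comments first, then inline comments.
    static String stripBlockThenLine(String js) {
        String noBlock = BLOCK_COMMENT.matcher(js).replaceAll("");
        return LINE_COMMENT.matcher(noBlock).replaceAll("");
    }

    // Inline comments first, then multi-line comments.
    static String stripLineThenBlock(String js) {
        String noLine = LINE_COMMENT.matcher(js).replaceAll("");
        return BLOCK_COMMENT.matcher(noLine).replaceAll("");
    }

    public static void main(String[] args) {
        String js = "/*\n"
                + "document.write(\"<h1>This is a heading</h1>\");\n"
                + "//*********************\n"
                + "//*/\n"
                + "var x = 1;\n";
        // The whole comment disappears and only the real code is left.
        System.out.println("block first: [" + stripBlockThenLine(js) + "]");
        // The line holding the closing */ is removed first, so the opening
        // /* is left dangling and the commented-out code stays in the output.
        System.out.println("line first:  [" + stripLineThenBlock(js) + "]");
    }
}
```

With inline comments removed first, the leftover /* now swallows whatever real code follows it, which is exactly the instability I am trying to avoid.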
So, I decided to delete the multi-line comments first. What
I am trying to do is search for a pattern inside a matched pattern. If you know how
to do this, please let me know.
I have tried some regexes; here are my findings.
This regex
(<script([\S\s]*?)>)(([\S\s]*?)([^:'\-"]//.*[^'"]$)([\S\s]*?))(</script>)
keeps only the first comment in the backreference, hence only the
first comment in a script block can be deleted.
And this regex
(<script([\S\s]*?)>)(([\S\s]*?)([^:'\-"]//.*[^'"]$)([\S\s]*?))*(</script>)
keeps only the last comment in the backreference, because a repeated group
remembers only its final match, hence this will not work either.
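One possible way around the backreference limit, sketched in Java under my own assumptions, is to stop asking a single regex to do both jobs: an outer matcher walks over the script blocks, and a second, inner pattern is applied to the body of each match. The names below are hypothetical, the inner pattern is a naive multi-line comment regex, and self-closing script tags are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: search for a pattern inside each match of another pattern.
final class NestedReplace {
    private static final Pattern SCRIPT_BLOCK = Pattern.compile(
            "(<script[\\S\\s]*?>)([\\S\\s]*?)(</script)", Pattern.CASE_INSENSITIVE);
    // Naive multi-line comment pattern, for illustration only.
    private static final Pattern BLOCK_COMMENT = Pattern.compile("/\\*[\\S\\s]*?\\*/");

    static String stripBlockCommentsInScripts(String jsp) {
        Matcher outer = SCRIPT_BLOCK.matcher(jsp);
        StringBuffer out = new StringBuffer();
        while (outer.find()) {
            // Inner search runs only on the JS body (group 2) of this block.
            String cleaned = BLOCK_COMMENT.matcher(outer.group(2)).replaceAll("");
            String rebuilt = outer.group(1) + cleaned + outer.group(3);
            // quoteReplacement keeps any '$' or '\' in the JS from being
            // interpreted as a group reference.
            outer.appendReplacement(out, Matcher.quoteReplacement(rebuilt));
        }
        outer.appendTail(out); // copy everything after the last script block
        return out.toString();
    }

    public static void main(String[] args) {
        String jsp = "<%-- /* keep */ --%><script>/* a */var x;/* b */</script>"
                + "<p>/* keep too */</p>";
        System.out.println(stripBlockCommentsInScripts(jsp));
    }
}
```

Because the inner replaceAll sees every comment in the body, it does not matter how many comments a block contains, and text outside script blocks is copied through untouched.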
I have to delete all the comments in every script block. I decided
to write a program that pulls all the script blocks out of the JSP and then
deletes all the multi-line comments first and then all the inline comments.
Writing a Java program from scratch and setting up all the
infrastructure, like logging and file and directory handling, is a waste of
time; better to focus on the main task. So I will write only the task itself:
an ANT task.
With the plan set, I started to write an ANT task to delete all the
JS comments from JSP files.
I will discuss the ANT task in my next post.