Obfuscation comes in handy to protect copyrighted or patented
software. Java source is compiled to an intermediate bytecode that is
so standardized and well documented that it can be decompiled to
match nearly 100% of the original code. It therefore becomes
necessary to obfuscate the bytecode so that it cannot be easily
decompiled to reveal the original source code.
Methods used to
Strip the debugging information such as variable names and line numbers.
Mangle the names of variables, methods, packages and classes.
Encode all string literals and provide a function to decode them. This
does not affect the output of the program, but the decompiled code
looks ugly and is not immediately understandable. Other literals, such
as integer and float literals, can be encoded in the same way.
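
As a rough illustration of string literal encoding, the sketch below
XORs each byte with a fixed key and stores the result as Base64. The
class name, the key and the encoding scheme are assumptions made up for
this example; real obfuscators use their own schemes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: names, key and scheme are invented for illustration.
class StringEncodingSketch {
    private static final int KEY = 0x5A; // arbitrary XOR key (assumption)

    // What an obfuscator might do at build time to hide a literal.
    static String encode(String plain) {
        byte[] bytes = plain.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return Base64.getEncoder().encodeToString(bytes);
    }

    // The decode function that ships with the obfuscated program.
    static String decode(String encoded) {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A decompiler would see only the encoded form and a call to decode().
        String stored = encode("license check failed");
        System.out.println(stored);
        System.out.println(decode(stored));
    }
}
```

The program behaves as before because every use of the literal goes
through decode(), but the readable text no longer appears in the class
file.
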
At the cost of some efficiency, introduce code that is equivalent in
functionality but considerably more complicated, using goto
statements, conditions that are always true, and expanded loops with
valid junk statements in between.
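
A minimal sketch of this idea at source level follows. Java source has
no goto statement, so the example uses only an always-true condition
and junk statements; the class and method names are hypothetical, and
the transformation shown is a hand-written illustration rather than the
output of any particular obfuscator.

```java
// Hypothetical sketch: both methods compute the same sum.
class ControlFlowSketch {
    // Original, readable version.
    static int sumPlain(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Equivalent version with an always-true condition and junk statements.
    static int sumObscured(int[] values) {
        int total = 0;
        int junk = 7;
        int i = 0;
        while (true) {
            if (i >= values.length) {
                break;
            }
            // i * (i + 1) is always even, so this branch is always taken.
            if ((i * (i + 1)) % 2 == 0) {
                total += values[i];
            } else {
                total -= junk; // never executed
            }
            junk = junk * 31 + i; // junk statement: result is never used
            i++;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5};
        System.out.println(sumPlain(data));
        System.out.println(sumObscured(data));
    }
}
```
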
Some code obfuscators can even insert non-compilable statements into
the bytecode. These do not affect the interpretation of the bytecode,
but they make decompilers fail because they are unable to decompile
such faulty code. Even if a decompiler succeeds, its output for such
non-compilable code is extremely difficult to understand. Bytecode
execution remains unaffected by the inserted code because bytecode
interpreters are typically more relaxed in their error checking,
assuming that the compiler has already done it.
Add extra unused code.
Make use of function overloading and give the same name to all
functions with different signatures. Imagine understanding code in
which all functions in a class (or in the entire program) have been
renamed to ‘a()’.
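
To give a feel for the result, here is a small hypothetical class after
such renaming. The original method names in the comments are invented
for this sketch; the point is only that the compiler still tells the
methods apart by their parameter types while a human reader cannot.

```java
// Hypothetical sketch of a class after overload-based renaming.
class OverloadSketch {
    // Originally (assumed): int nextId(int id)
    static int a(int a) {
        return a + 1;
    }

    // Originally (assumed): String greet(String name)
    static String a(String a) {
        return "Hello, " + a;
    }

    // Originally (assumed): double area(double width, double height)
    static double a(double a, double b) {
        return a * b;
    }

    public static void main(String[] args) {
        // Three different methods, one name: resolved by argument types.
        System.out.println(a(41));
        System.out.println(a("world"));
        System.out.println(a(2.0, 3.0));
    }
}
```
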
Mangle the line number information. Line number information is present
in bytecode to help debug a program, and decompilers use it to
reconstruct the original source code more accurately. Obfuscators
therefore mangle this information to confuse the decompilers.
The above methods can sometimes lead to problems in the actual
execution of the program if the obfuscator is not careful to avoid the
following:
Dynamic class loading (using Class.forName() or
ClassLoader.loadClass()) can fail if the package or class names are
mangled by the obfuscator. Although modern obfuscators are careful to
replace static strings used in such dynamic class loader invocations,
problems arise when the class or package name is input by the user or
is constructed dynamically by string manipulation.
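
The failure mode can be sketched as follows. The name is assembled at
run time, so an obfuscator that only rewrites string constants has
nothing to rewrite. The package and class names passed in main() are
hypothetical; the second lookup stands in for a class whose real name
was mangled, so the original name no longer exists.

```java
// Hypothetical sketch: the class name is built at run time.
class DynamicLoadingSketch {
    // Returns true if a class with the assembled name can be loaded.
    static boolean canLoad(String packageName, String simpleName) {
        // Built by string manipulation, so it is invisible to the obfuscator.
        String fullName = packageName + "." + simpleName;
        try {
            Class.forName(fullName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A JDK class keeps its name, so this lookup succeeds.
        System.out.println(canLoad("java.util", "ArrayList"));
        // Stands in for an application class that was renamed (assumed name).
        System.out.println(canLoad("com.example.billing", "InvoiceParser"));
    }
}
```
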
Code using reflection (for example Class.getMethod() or
Class.getField()) runs into the same problem as dynamic class loading
above if name mangling is performed.
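
A small sketch of the reflection case: the nested class below stands in
for a class after obfuscation, where a method assumed to have been
called computeTotal() is now a(). All names are hypothetical. Code that
still looks the method up by its original name fails at run time.

```java
// Hypothetical sketch: method lookup by name after renaming.
class ReflectionSketch {
    // Stands in for an obfuscated class: computeTotal() became a().
    static class Invoice {
        public int a() {
            return 42;
        }
    }

    // Returns true if the type has a public no-argument method with this name.
    static boolean hasMethod(Class<?> type, String name) {
        try {
            type.getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The mangled name exists, the original name does not.
        System.out.println(hasMethod(Invoice.class, "a"));
        System.out.println(hasMethod(Invoice.class, "computeTotal"));
    }
}
```
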
It is not possible to deserialize an obfuscated class into a
non-obfuscated class or vice versa. Care must therefore be taken to
use the same class for serialization and deserialization: either use
the obfuscated class for both operations or use the non-obfuscated
class for both. Do not mix the two.
The code may be expected to contain some well-defined method names
that callers of the class assume to be present. If such method names
are obfuscated, the code becomes unusable. This is especially true in
EJB, where method signatures are more a matter of convention than a
specification in some interface or base class.
An obfuscated stack trace of an exception can be a nightmare for the
developer. If the code is not yet mature and is being run for the
first time without adequate bug fixing, obfuscation may lead to major
maintenance problems.
Some obfuscators, however, provide a utility that can reconstruct the
original stack trace of an exception even from obfuscated code.
This is done by keeping a reverse-mapping file of the obfuscations
performed on the code. Such a mapping file holds the original as well
as the mangled names, along with the mangled line numbers. When
an exception occurs in obfuscated code, these utilities look up the
mapping file and reconstruct the original exception as much as