Java Output Folder
Last revised 15:45 Friday October 12, 2001
Work item: "Support for dealing with class files generated by external
Java compilers like javac and jikes from an Ant script."
Here's the crux of one problem (from WSAD, via John W.):
In some environments the client has limited flexibility in how they
structure their Java projects. Sources must go here; resource files here;
mixed resource and class files here; etc.
-
There are situations where the client needs to place additional class files
and resource files in the output directory.
-
There are situations where the client needs to generate class files into
an existing folder filled with their class files and resource files (e.g.,
an exploded WAR file).
When clients attempt either, they discover that (a) the Java model and
builder ignore any class files in the output folder, and (b) from time
to time these files in the output folder get deleted without warning.
The Java builder was designed under the assumption that it "owns" the
output folder. The work item, therefore, is to change the Java builder
to give clients and users more flexibility as to where they place their
source, resource, library class, and generated class files.
Current Functionality
The Eclipse 1.0 Java builder has the following characteristics (and inconsistencies):
-
The class files generated by the Java builder go in a single output folder.
Java source files go in one or more source folders. Any other kind of files
can be included in the source folder too; this includes pre-compiled class
files. All these other files will be automatically mirrored to the binary
output directory when a build is done. The mirror is maintained as the
source folder changes; damage made directly to the output folder gets repaired
no later than the next full build.
-
The output folder belongs to the Java builder. It summarily deletes files
from the output folder that it does not think belong there. It is not possible
to get away with adding files directly to the output folder. So you cannot
even mate the extra resource files with the class files manually.
-
When the project source and output folder coincide (perhaps at the project
itself), the builder behaves differently. It grants that source files belong
there, so it never deletes them. It also grants that non-class files belong
there, so it never deletes them either. But it assumes that all class files
are generated, and so it summarily deletes them on every full build, including
class files that were explicitly put there. This is inconsistent with the
way things work out when the output folder and the source folder do not
coincide. And it is not what you want if you need to mate other class files
with the generated ones.
Proposal
The Java model has 2 primitive kinds of inputs: Java source files, and
Java library class files. The Java builder produces one primary output:
generated Java class files. Each Java project has a build classpath listing
what kinds of inputs it has and where they can be found, and a designated
output folder where generated class files are to be placed. The runtime
classpath is computed from the build classpath by substituting the output
folder in place of the source folders.
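The substitution described above can be sketched as follows. This is only an illustration: the `ClasspathSketch`, `Entry`, and `Kind` names are hypothetical and are not part of the Java model API, and dropping duplicate entries is an assumption made here so that several source folders collapse into a single output folder entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a build classpath; not the JDT API.
final class ClasspathSketch {
    enum Kind { SOURCE, LIBRARY }

    static final class Entry {
        final Kind kind;
        final String path;

        Entry(Kind kind, String path) {
            this.kind = kind;
            this.path = path;
        }
    }

    // Walks the build classpath in order, putting the output folder in
    // place of each source folder and keeping library entries as they are.
    // A LinkedHashSet keeps the first occurrence and drops later duplicates.
    static List<String> runtimeClasspath(List<Entry> buildClasspath, String outputFolder) {
        Set<String> result = new LinkedHashSet<>();
        for (Entry entry : buildClasspath) {
            result.add(entry.kind == Kind.SOURCE ? outputFolder : entry.path);
        }
        return new ArrayList<>(result);
    }
}
```

With two source folders and one library folder on the build classpath, the sketch yields the output folder followed by the library folder.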
Java "resource" files, defined to be files other than Java sources and
class files, are of no particular interest to the Java model for compiling
purposes. However, these resource files are very important to the user,
and to the program when it runs. Resource files are routinely co-located
with library class files. But it is also convenient for the user if resource
files can be either co-located with source code, or segregated in a separate
folder.
Ideally, the Java model should not introduce constraints on where inputs
and outputs are located. This would give clients and users maximum flexibility
with where they locate their files.
The proposal here has 4 separate parts. Taken in conjunction they remove
the current constraints that make it difficult for some clients to place
their files where they need to be.
[Revised proposal: Rather than write a completely new proposal, I've added
a note like this one to the end of each subsequent section describing a revised
proposal.]
Java Builder Attitude Adjustment
To appreciate the difficulties inherent with the Java builder sharing its
output folder with other folk, consider the following workspace containing
a Java project. Assume that this project has not been built in quite a
while, and the user has been manually inserting and deleting class files
in the project's output folder.
Java project p1/
  src/com/example/ (source folder on build classpath)
    Bar.java
    Foo.java
    Quux.java
  bin/com/example/ (output folder)
    Bar.class {SourceFile="Bar.java"}
    Foo.class {SourceFile="Foo.java"}
    Foo$1.class {SourceFile="Foo.java"}
    Internal.class {SourceFile="Foo.java"}
    Main.class {SourceFile="Main.java"}
From this arrangement of files (and looking at the SourceFile attribute
embedded in class files), we can infer that:
-
Bar.class came from compiling a source file named "Bar.java".
-
Foo.class, Foo$1.class, and Internal.class all came from compiling a "Foo.java".
(A single source file will compile to multiple separate class files if
it has nested classes or secondary non-public classes.)
-
There are no existing class files corresponding to "Quux.java".
-
Main.class came from compiling a source file named "Main.java", which the
workspace doesn't have.
Java Builder - Obsolete Class File Deletion
If the user were to request a full build of this project, how would the
Java builder proceed? Before it compiles any source files, it begins by
deleting existing class files that correspond to source files it is about
to recompile. Why? Because obsolete class files left around (a) waste storage
and (b) would be available at runtime where they could cause the program
to run incorrectly.
In this situation, the Java builder deletes the class files corresponding
to Bar.java (i.e., Bar.class), to Foo.java (i.e., Foo.class, Foo$1.class,
and Internal.class), and to Quux.java (none, in this case). The remaining
class file (Main.class) must be retained because it is irreplaceable.
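As a rough illustration of this selection, the sketch below works on an in-memory map from class file name to SourceFile attribute rather than on real class files. The `ObsoleteClassFiles` class and its method are hypothetical names, and the sketch assumes all files live in a single package. A null value stands for a class file with no SourceFile attribute, which is treated as having no corresponding source.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; it only decides which names to delete and
// never touches the file system.
final class ObsoleteClassFiles {

    // sourceFileAttr: class file name -> SourceFile attribute (null if absent).
    // currentSources: names of the source files about to be compiled.
    static Set<String> toDeleteBeforeFullBuild(Map<String, String> sourceFileAttr,
                                               Set<String> currentSources) {
        Set<String> doomed = new TreeSet<>();
        for (Map.Entry<String, String> entry : sourceFileAttr.entrySet()) {
            String source = entry.getValue();
            // Delete only class files that map to a source file we still have.
            if (source != null && currentSources.contains(source)) {
                doomed.add(entry.getKey());
            }
        }
        return doomed;
    }
}
```

Fed the example workspace, the sketch selects Bar.class, Foo.class, Foo$1.class, and Internal.class, and leaves Main.class alone.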
The Java builder takes responsibility for deleting obsolete class files
in order to support automated incremental recompilation of entire folders
of source files. Note that standard Java compilers like javac never ever
delete class files; they simply write (or overwrite) class files to the
output folder for the source files that they are given to compile. Standard
Java compilers do not support incremental recompilation: the user is responsible
for deleting any obsolete class files that they bring about.
If the Java builder is free to assume that all class files in the output
folder are ones that correspond to source files, then it can simply delete
all class files in the output folder at the start of a full build. If it
cannot assume this, the builder is forced to look at class files in the
output folder to determine whether it has source code for them. This is
clearly more expensive than not having to do so. By declaring that it "owns"
the output folder, the current builder is able to make this simplifying
assumption. Allowing users and clients to place additional class files
in the output folder requires throwing out this assumption.
If the user or client is free to manipulate class files in the output
folder without the Java builder's involvement, then the builder cannot
perform full or incremental builds without looking at and deleting the
obsolete class files from the output folder corresponding to source files
being compiled.
Under the proposed change, the Java builder would need to look at the
class files in the output folder to determine whether it should delete
them. The only files in the output folder that the Java builder would
be entitled to overwrite or delete are class files which the Java builder
would reasonably generate, or did generate, while compiling that project.
-
The Java builder is entitled to overwrite class files in the output folder
that correspond to current source files. Any class file at such a path
is the Java builder's. Even when the actual contents of the class file
came from elsewhere, the builder is always entitled to delete it or overwrite
it with its own output.
-
The only files in the output folder that the Java model/builder would be
entitled to delete outright are ones that had been generated by the Java
builder when compiling this project but which no longer correspond to a
current source file. This permits the Java builder to clean up obsolete
class files that it knows it generated, or would have generated, on an
earlier build. It does not have the right to delete other class files,
even ones which do not correspond to a current source file. (Otherwise
the Java builder could justify deleting any class file that it does not
have corresponding source for.) Even for a full build, the Java builder
is not allowed to scrub all class files from the output folder (unless
it happens to know for a fact that the only class files in there are ones it
generated).
-
The SourceFile attribute is an optional attribute of class files that is not generated
when debug info is suppressed (javac -g:none). Class files in the output
folder without the SourceFile attribute should be treated as if there was
no corresponding source file. This means they never get deleted outright,
although they may still be overwritten as required.
-
Note: changing a project to give it a different output folder should absolve
the Java builder of responsibility for any generated class files in the
former output folder. This means the Java builder does not need to perform
cleanup or track anything outside the current output folder.
-
Note: adding a source entry to the build classpath causes a bunch of new
source files to enter the frame. Some of the existing class files in the
output folder might now map to these source files, possibly in preference
to where they mapped before. Removing a source entry from the build classpath
causes a bunch of source files to leave the picture. Some of the existing
class files in the output folder might now map to other source files, or
not map to any source file at all. [We need to decide whether obsolete class
files need to be tracked across the addition and/or removal of source
entries from the build classpath.]
This change is not a breaking API change. The old spec said that the Java
model/builder owned the output folder, but didn't further specify what
all that entailed. The new spec will modify this position to allow clients
to store files in the output folder; it will promise that these files are
perfectly safe unless they are in the Java builder's direct line of fire.
Java Model - Obsolete Class File Deletion
There is another facet of the obsolete class file problem that the Java
builder is not in a position to help with.
If the source file Foo.java were to be deleted, its three class files
become obsolete and need to be deleted immediately. Why immediately?
Consider what happens if the class files are not deleted immediately. If
the user requests a full build, the Java builder is presented with the
following workspace:
Java project p1/
  src/com/example/ (source folder on build classpath)
    Bar.java
    Quux.java
  bin/com/example/ (output folder)
    Bar.class {SourceFile="Bar.java"}
    Foo.class {SourceFile="Foo.java"}
    Foo$1.class {SourceFile="Foo.java"}
    Internal.class {SourceFile="Foo.java"}
    Main.class {SourceFile="Main.java"}
Since a full build is requested, the Java builder is not passed a resource
delta tree for the project. This means that the Java builder has no way
of knowing that Foo.java was just deleted. The Java builder has no choice
but to retain the three class files Foo.class, Foo$1.class, and Internal.class,
just as it retains Main.class. This too is a consequence of allowing the
Java builder to share the output folder with the user's class files.
If the obsolete class files are not deleted in response to the deletion
of a source file, these class files will linger around. The Java builder
will be unable to get rid of them.
The proposal is to have the Java model monitor source file deletions
on an ongoing basis and identify and delete any corresponding obsolete
class files in the output folder. This clean up activity must handle the
case of source files that disappear while the Java Core plug-in is not
activated (this entails registering a Core save participant).
Since deleting (including renaming and moving) a source file is a relatively
uncommon thing for a developer to do, the implementation should bet that it
does not have to do this very often. When a source file is deleted, its
package name gives us exactly which subfolder of the output folder might
contain corresponding class files that might now be obsolete. In the worst
case, the implementation would need to access all class files in that subfolder
to determine whether any of them have become obsolete. In cases where there
is more than one source folder on the build classpath, and there is therefore
the possibility of one source file hiding another by the same name, it
is necessary to consult the build classpath to see whether the deleted
source file was exposed or buried.
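The mapping from a deleted source file to the output subfolder that needs inspecting could look roughly like this. The `OutputSubfolder` class is a hypothetical name, and the sketch assumes the package name is taken from the file's position under its source folder, not from its package declaration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; it computes a path and does no file I/O.
final class OutputSubfolder {

    // Returns the subfolder of the output folder that may contain class
    // files for the given source file. A source file sitting directly in
    // the source folder (default package) maps to the output folder itself.
    static Path candidateFolder(Path sourceFolder, Path sourceFile, Path outputFolder) {
        Path packageDir = sourceFolder.relativize(sourceFile).getParent();
        return packageDir == null ? outputFolder : outputFolder.resolve(packageDir);
    }
}
```

For p1/src/com/example/Foo.java with output folder p1/bin, the only folder to inspect is p1/bin/com/example.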
Implementation Tricks
Some observations and implementation tricks that should help reduce the
space and time impact of doing this.
-
When one or more source files are deleted from a single source folder,
their position under the source package fragment root gives us the package
name. This package name tells us exactly which subfolder of the output
folder might contain corresponding class files that might now be obsolete.
In the worst case, the implementation would need to access all class files
in that subfolder to determine whether any of them have become obsolete.
This shows that you only need information about a small portion of the
output folder in order to process one or more deletions within a single
source folder.
-
A source file named Foo.java typically compiles to a single class file
named Foo.class. There might be more class files (for nested classes and/or
secondary non-public types); and there might be fewer (when the source file
contains only non-public types with names other than "Foo"). When recording
the extracted source file name information, only the exceptional cases
need to be represented explicitly. For example, only Foo$1.class (derived
from Foo.java) and Internal.class (derived from Foo.java) are unusual;
Bar.class, Foo.class, and Main.class are all derived from source files
with the expected name. This means you can store the information extracted
from class files much more compactly than a simple class file name to SourceFile
string mapping.
-
There is often only one source folder on the build classpath. In this
case, all source files in the source folder get compiled; there is no possibility
of one source file "hiding" another by the same name. This observation
can be used to avoid checking for source file hiding.
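The compact representation from the second trick above might be sketched like this. The `CompactSourceIndex` class is hypothetical. It stores an entry only when a class file's SourceFile attribute differs from the name one would expect, and it assumes that lookups are made only for class files that were recorded earlier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical index covering the class files of one output subfolder.
final class CompactSourceIndex {
    // Holds only the exceptional cases, e.g. Foo$1.class -> Foo.java.
    private final Map<String, String> exceptions = new HashMap<>();

    // "Foo.class" -> "Foo.java"
    static String expectedSource(String classFileName) {
        int end = classFileName.length() - ".class".length();
        return classFileName.substring(0, end) + ".java";
    }

    // A null attribute (no SourceFile present) is also stored as an exception.
    void record(String classFileName, String sourceFileAttr) {
        if (!expectedSource(classFileName).equals(sourceFileAttr)) {
            exceptions.put(classFileName, sourceFileAttr);
        }
    }

    // Only meaningful for class files that were passed to record().
    String sourceOf(String classFileName) {
        return exceptions.getOrDefault(classFileName, expectedSource(classFileName));
    }

    int storedEntries() {
        return exceptions.size();
    }
}
```

Recording the five class files of the earlier example stores just two entries, for Foo$1.class and Internal.class.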
When all else fails
A special concern is that the user must be able to recover from crashes
or other problems that result in obsolete class files being left behind
in the output folder. It can be very bad when this kind of thing happens
(and it does happen, despite our best efforts), and can undercut the user's
confidence in the Java compiler and IDE. In a large output folder that
contains important user files, the user can't just delete the output folder
and do a full build. The user has no easy way to distinguish class files
with corresponding source from ones without. A simple way to address this
need would be to have a command (somewhere in the UI) that would delete
all class files in the output folder for which source code is available
("Delete Generated Class Files"). This would at least give the user some
help in recovering from these minor disasters.
[Revised proposal: The Java builder remembers the names of the class
files it has generated. On full builds, it cleans out all class files that
it has on record as having generated; all other class files are left in
place. On incremental builds, it selectively cleans out the class files
that it has on record as having generated corresponding to the source files
that it is going to recompile. There is no need to monitor source file
deletions: corresponding generated class files will be deleted on the next
full build (because it nukes them all) or next incremental build (because
it sees the source file deletion in the delta). The Java builder never
looks at class files for their SourceFile attributes. A full build always
deletes generated class files, so there's no need for a special UI action.]
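The bookkeeping in the revised proposal could be sketched as below. The `GeneratedClassRecord` class and its method names are hypothetical, not the builder's actual state format. The sketch keeps everything in memory and only reports which class files should be cleaned out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical record of what the builder generated on earlier builds.
final class GeneratedClassRecord {
    // source file -> class files generated from it
    private final Map<String, Set<String>> generated = new HashMap<>();

    void recordGenerated(String sourceFile, String classFile) {
        generated.computeIfAbsent(sourceFile, key -> new TreeSet<>()).add(classFile);
    }

    // Full build: every class file on record goes, and the record is reset.
    // Class files that were never recorded are never returned.
    Set<String> cleanForFullBuild() {
        Set<String> doomed = new TreeSet<>();
        for (Set<String> classFiles : generated.values()) {
            doomed.addAll(classFiles);
        }
        generated.clear();
        return doomed;
    }

    // Incremental build: only class files recorded for the source files
    // that the delta reports as changed or deleted are cleaned out.
    Set<String> cleanForIncrementalBuild(Set<String> changedOrDeletedSources) {
        Set<String> doomed = new TreeSet<>();
        for (String source : changedOrDeletedSources) {
            Set<String> classFiles = generated.remove(source);
            if (classFiles != null) {
                doomed.addAll(classFiles);
            }
        }
        return doomed;
    }
}
```

Because a user-supplied file such as Main.class is never recorded, neither kind of build can select it for deletion.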
Allowing Folders to Play Multiple Roles
The proposed change is to consistently allow the same folder to be used
in multiple ways on the same build classpath.
-
As source folder and as output folder.
-
N.B. This is currently supported (e.g., when folder is the project root).
-
Allows generated class files to be co-located with Java source files.
-
Since output folder is automatically included on runtime classpath, this
arrangement would automatically make any class files or resource files
available at runtime.
-
However, these class files would not be seen at compile time unless the folder is also a library folder.
-
Recommendation: when class files or resources are present in a folder,
there should always be a library folder entry on the build classpath for
it.
-
As source folder and as library folder.
-
N.B. This is currently disallowed.
-
Allows library class files to be co-located with Java source files.
-
Allows resource files to be co-located with Java source files.
-
In virtue of being a library entry on the build classpath, the folder is
used at compile time for library class files and is included on the runtime
classpath.
-
As library folder and as output folder.
-
N.B. This is currently disallowed.
-
Allows library class files to be co-located with generated class files.
-
Allows resource files to be co-located with generated class files.
-
Remove duplicate entry when forming the runtime class path.
-
Note that the generated class files in this library folder are ignored
by the builder because it has source for all these by definition.
-
As source folder and as output folder and as library folder.
-
This is just a combination of all of above.
-
Allows library class files, generated class files, and resource files to
be co-located with Java source files.
-
Simple "one folder Java development" setup for someone with library class
files and possibly resources.
This change is not a breaking change; it would simply allow some classpath
configurations that are currently disallowed to be considered legitimate.
The API would not need to change.
[Revised proposal: Many parts of the Java model assume that library
folders are relatively quiet. Allowing a library folder to coincide with the
output folder would invalidate this assumption, which would tend to degrade
performance. For instance, the indexer indexes libraries and source folders,
but completely ignores the output folder. If the output folder was also
a library, it would repeatedly extract indexes for class files generated
by the builder.
N.B. This means that the original scenario of library class files
in the output folder is unsupportable.
A source folder would still be allowed to coincide with a library folder.]
Completely eliminate resource file copying behavior
The current Java builder copies "resource" files from source folders to
the output folder (provided that source and output do not coincide). Once
in the output folder, the resource files are available at runtime because
the output folder is always present on the runtime class path.
This copying is problematic:
-
Copying creates duplicates of resource files.
-
Takes up extra disk space.
-
Copying resources takes extra time.
-
Increases risk of user confusion (modify the copy).
-
Copying is out of character for Java builder.
-
Java builder should compile Java source files to binary class files.
-
Copying behavior is quirky.
-
Resources are never copied from a source folder that coincides with the
output folder.
-
Resources are copied from a source folder that does not coincide with the
output folder, even if the output folder happens to be another source folder.
-
Modifying the copy and building causes the file to be deleted (!); it is
replaced by a fresh copy on the next full build.
-
When there are several resource files with same name, it is impossible
to reliably control which one ends up in the output folder.
-
When the project source is the project itself, and the output is in a folder
under the project, the builder copies the .classpath file into the output
folder too.
The proposal is to eliminate this copying behavior. The proper way to handle
this is to include an additional library entry on the build classpath for
any source folders that contain resources. Since library entries are also
included on the runtime classpath, the resource files contained therein
will be available at runtime.
We would beef up the API specification to explain how the build classpath
and the runtime classpath are related, and suggest that one deal with
resource files in source folders using library entries. This would be a
breaking change for clients or users that rely on the current resource
file copying behavior.
The clients that would be most affected are ones that co-locate their
resource files with their source files in a folder separate from their
output folder. This is a fairly large base of customers that would need
to add an additional library entry for their source folder.
It would be simple to write a plug-in that detected and fixed up the
Java projects in the workspace as required. By the same token, the same
mechanism could be built in to the Java UI. If the user introduces a resource
file into a source folder that had none and there is no library entry
for that folder on the build classpath, ask the user whether they intend
this resource file to be available at runtime.
(JW believes that WSAD will be able to roll with this punch.)
[Revised proposal: Retain copying from source to output folder where
necessary.
-
Source folder different from output folder, no additional source folders:
copy resources from source folder to output folder (current behavior).
-
Source folder different from output folder, additional source folders:
copy resources from all source folders to output folder honoring build
classpath ordering (current behavior).
-
Source folder same as output folder, and no additional source folders:
no copying (current behavior).
-
Source folder same as output folder, and additional source folders: error
(new behavior).
This eliminates the screw case where resources get copied from one source
folder into another source folder, possibly overwriting client data.]
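The four cases of the revised proposal reduce to a small decision, sketched below. The `ResourceCopyPolicy` class and `Decision` enum are hypothetical names. The sketch reads "source folder same as output folder" as "some source folder on the build classpath is the output folder", which is an interpretation rather than something the proposal states in those words.

```java
import java.util.List;

// Hypothetical policy check; folders are compared by their path strings.
final class ResourceCopyPolicy {
    enum Decision { COPY, NO_COPY, ERROR }

    static Decision decide(List<String> sourceFolders, String outputFolder) {
        boolean coincides = sourceFolders.contains(outputFolder);
        if (!coincides) {
            // One or many source folders, all distinct from the output
            // folder: copy resources, honoring build classpath order.
            return Decision.COPY;
        }
        // The output folder is itself a source folder: fine when it is the
        // only one, an error when other source folders would copy into it.
        return sourceFolders.size() == 1 ? Decision.NO_COPY : Decision.ERROR;
    }
}
```

A project whose only source folder is also its output folder gets NO_COPY; adding a second source folder to that project turns the configuration into an ERROR.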
Minimize the opportunity for obsolete class files to have bad effects
The Java compiler should minimize the opportunity for obsolete class files
to have bad effects.
Consider the following workspace:
Java project p1/
  src/com/example/ (source folder on build classpath)
    C1.java {package com.example;
             public class C1 {}}
    C2.java {package com.example;
             public class C2 extends Secondary {}}
  lib/com/example/ (library folder on build classpath)
    C1.class {from compiling an old version of C1.java that read
              package com.example; public class C1 {}; class Secondary {}}
    C2.class {from compiling an old but unchanged version of C2.java}
    Secondary.class {from compiling an old version of C1.java}
    Quux.class {from compiling Quux.java}
Assume the source folder precedes the library folder on the build classpath
(sources should always precede libraries).
When the compiler is compiling both C1.java and C2.java, it should not
satisfy the reference to the class com.example.Secondary using the existing
Secondary.class because the SourceFile attribute shows that Secondary.class
is clearly an output from compiling C1.java, not an input. In general,
the compiler should ignore library class files that correspond to source
files which are in the process of being recompiled. (In this case, only
Quux.class is available to satisfy references.) The Java builder does not
do this.
Arguably, the current behavior should be considered a bug. (javac 1.4
(beta) has this bug too.) Fixing this bug should not be a breaking change.
When the SourceFile attribute is not present in a library class file, there
is no choice but to use that class file.
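The filtering rule proposed in this section could be sketched as follows. The `LibraryVisibility` class is a hypothetical name, and the sketch uses an in-memory map from library class file name to SourceFile attribute instead of reading real class files. A null value means the attribute is absent, in which case the class file stays usable.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical filter over the class files of one library folder.
final class LibraryVisibility {

    // Returns the library class files that may satisfy references while
    // the given source files are being compiled.
    static Set<String> usableLibraryClasses(Map<String, String> librarySourceAttr,
                                            Set<String> sourcesBeingCompiled) {
        Set<String> usable = new TreeSet<>();
        for (Map.Entry<String, String> entry : librarySourceAttr.entrySet()) {
            String source = entry.getValue();
            // Hide class files that are really outputs of a source file
            // that is being recompiled right now.
            if (source == null || !sourcesBeingCompiled.contains(source)) {
                usable.add(entry.getKey());
            }
        }
        return usable;
    }
}
```

For the workspace above, compiling C1.java and C2.java leaves only Quux.class usable from the library folder.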
[Revised proposal: Maintain current behavior.]