A simplistic Java scanner. This scanner returns a sequence of tokens that can be used to
reconstruct the source code. Because the source code comes from a string, the scanner in fact
returns only the token boundaries (offsets into that string) rather than the tokens themselves.
We are not dealing with arbitrary user code, so we can assume there are no exotic things like
tabs or Unicode escapes that resolve into quotes. The purpose of the scanner here is to
return a sequence of offsets that split the string up in a way that allows us to work with
spaces without having to worry whether they are inside strings or comments. The particular
properties we use are that every string and character literal and every comment is a single
token; every newline plus all following indentation is a single token; and every other run
of consecutive spaces outside a comment or literal is a single token. That means that we can
safely compress a token that starts with a space into a single space, without falsely removing
indentation or changing the contents of strings.
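
The properties above can be illustrated with a minimal sketch. This is not the actual scanner: the class name SimpleScannerSketch and the methods tokenEnd, boundaries, and compressSpaces are hypothetical names chosen for illustration. The sketch assumes input without tabs, Unicode escapes, or text blocks, and treats every character that does not start a literal, comment, newline, or run of spaces as its own token.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical. */
final class SimpleScannerSketch {

  /** Returns the offset just past the token that starts at {@code start}. */
  static int tokenEnd(String s, int start) {
    int n = s.length();
    char c = s.charAt(start);
    if (c == '\n' || c == ' ') {
      // A newline plus its indentation, or a run of spaces, is one token.
      int i = start + 1;
      while (i < n && s.charAt(i) == ' ') {
        i++;
      }
      return i;
    }
    if (c == '"' || c == '\'') {
      // A string or character literal is one token; skip escaped characters.
      int i = start + 1;
      while (i < n && s.charAt(i) != c) {
        if (s.charAt(i) == '\\') {
          i++;
        }
        i++;
      }
      return Math.min(i + 1, n);
    }
    if (c == '/' && start + 1 < n) {
      char d = s.charAt(start + 1);
      if (d == '/') {
        // A line comment runs up to, but does not include, the newline.
        int newline = s.indexOf('\n', start);
        return newline < 0 ? n : newline;
      }
      if (d == '*') {
        // A block comment runs through the closing "*/".
        int close = s.indexOf("*/", start + 2);
        return close < 0 ? n : close + 2;
      }
    }
    // Simplification: any other character is a token by itself.
    return start + 1;
  }

  /** Returns every token start offset, followed by the length of the string. */
  static List<Integer> boundaries(String s) {
    List<Integer> offsets = new ArrayList<>();
    int i = 0;
    while (i < s.length()) {
      offsets.add(i);
      i = tokenEnd(s, i);
    }
    offsets.add(s.length());
    return offsets;
  }

  /** Collapses each token that starts with a space into a single space. */
  static String compressSpaces(String s) {
    List<Integer> offsets = boundaries(s);
    StringBuilder out = new StringBuilder();
    for (int k = 0; k + 1 < offsets.size(); k++) {
      String token = s.substring(offsets.get(k), offsets.get(k + 1));
      out.append(token.charAt(0) == ' ' ? " " : token);
    }
    return out.toString();
  }
}
```

Because indentation tokens start with a newline rather than a space, and spaces inside literals and comments never start a token of their own, compressSpaces in this sketch leaves both untouched while still collapsing runs of spaces elsewhere.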
In addition to real Java syntax, this scanner recognizes tokens of the form `text`, which are
used in the templates to wrap fully-qualified type names, so that they can be extracted and
replaced by imported names if possible.
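
As a rough illustration of how such backtick-wrapped names might be consumed, the sketch below replaces each wrapped fully-qualified name with its simple name and records the name as an import candidate. The class BacktickTypeSketch and the method shorten are hypothetical. The sketch assumes that backticks appear only as these wrappers, and it ignores the cases where importing is not possible, such as two types with the same simple name.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; all names here are hypothetical. */
final class BacktickTypeSketch {

  /**
   * Replaces each `fully.qualified.Name` in the template with its simple name
   * and adds the fully-qualified name to {@code imports}.
   */
  static String shorten(String template, Set<String> imports) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < template.length()) {
      int open = template.indexOf('`', i);
      int close = open < 0 ? -1 : template.indexOf('`', open + 1);
      if (close < 0) {
        // No complete backtick pair remains: copy the rest unchanged.
        out.append(template, i, template.length());
        break;
      }
      out.append(template, i, open);
      String qualifiedName = template.substring(open + 1, close);
      imports.add(qualifiedName);
      // Simplification: always use the simple name, without conflict checks.
      out.append(qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1));
      i = close + 1;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    Set<String> imports = new TreeSet<>();
    String result = shorten("`java.util.List`<`java.lang.String`> names;", imports);
    System.out.println(result);
    System.out.println(imports);
  }
}
```

A fuller version would decide per name whether an import is safe, and fall back to the fully-qualified spelling when it is not.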