Class PatternMatcherInput


  • public final class PatternMatcherInput
    extends java.lang.Object
    The PatternMatcherInput class preserves state across calls to the contains() methods of PatternMatcher instances. It can also restrict pattern matching to a subregion of a string. Preserving state means only that the end offset of the last match is remembered, so that the next search starts where the last match left off. This offset can be read with the getCurrentOffset() method and set with the setCurrentOffset(int) method.

    You would use a PatternMatcherInput object when you want to search for more than just the first occurrence of a pattern in a string, or when you only want to search a subregion of the string for a match. An example of its most common use is:

     PatternMatcher matcher;
     PatternCompiler compiler;
     Pattern pattern;
     PatternMatcherInput input;
     MatchResult result;
    
     compiler = new Perl5Compiler();
     matcher  = new Perl5Matcher();
    
     try {
       pattern = compiler.compile(somePatternString);
     } catch(MalformedPatternException e) {
       System.out.println("Bad pattern.");
       System.out.println(e.getMessage());
       return;
     }
    
     input   = new PatternMatcherInput(someStringInput);
    
     while(matcher.contains(input, pattern)) {
       result = matcher.getMatch();  
       // Perform whatever processing on the result you want.
     }
     // Suppose we want to start searching from the beginning again with
     // a different pattern.
     // Just set the current offset to the begin offset.
     input.setCurrentOffset(input.getBeginOffset());
    
     // Second search omitted
    
     // Suppose we're done with this input, but want to search another string.
     // There's no need to create another PatternMatcherInput instance.
     // We can just use the setInput() method.
     input.setInput(aNewInputString);
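
The idea of remembering where the last match ended, and of limiting the search to a subregion, can be modelled with the standard java.util.regex package alone. The following is only a sketch under stated assumptions: OffsetCursorSketch, next() and rewind() are hypothetical names invented for illustration, they are not part of this library, and stepping one character past an empty match is a simplification rather than the behaviour of any PatternMatcher implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in that models the "remembered offset" idea; not part of this library. */
final class OffsetCursorSketch {
    private final String text;
    private final int begin;   // first offset of the searchable region
    private final int end;     // one past the last offset of the region
    private int current;       // where the next search starts

    OffsetCursorSketch(String text, int begin, int length) {
        this.text = text;
        this.begin = begin;
        this.end = begin + length;
        this.current = begin;
    }

    /** Returns the next match inside the region, or null when none is left. */
    String next(Pattern pattern) {
        if (current > end) {
            return null;
        }
        // Search only between the remembered offset and the end of the region.
        Matcher m = pattern.matcher(text).region(current, end);
        if (!m.find()) {
            return null;
        }
        // Remember where this match ended; step past an empty match
        // so that a loop over next() cannot stall (a simplification).
        current = (m.end() == m.start()) ? m.end() + 1 : m.end();
        return m.group();
    }

    /** Restarts searching from the beginning of the region. */
    void rewind() {
        current = begin;
    }

    public static void main(String[] args) {
        // Region starts at offset 3 and is 8 characters long: "b22 c333".
        OffsetCursorSketch input = new OffsetCursorSketch("a1 b22 c333", 3, 8);
        Pattern digits = Pattern.compile("\\d+");
        String match;
        while ((match = input.next(digits)) != null) {
            System.out.println(match);
        }
    }
}
```

Because the region starts at offset 3, the digit at offset 1 is never considered. Calling rewind() plays the same role as input.setCurrentOffset(input.getBeginOffset()) in the example above.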
    
     
    Since:
    1.0
    See Also:
    PatternMatcher
    • Constructor Summary

      Constructors 
      Constructor Description
      PatternMatcherInput​(char[] input)
      Like calling PatternMatcherInput(input, 0, input.length).
      PatternMatcherInput​(char[] input, int begin, int length)
      Creates a PatternMatcherInput object, associating a region of a string (represented as a char[]) as input to be used for pattern matching by PatternMatcher objects.
      PatternMatcherInput​(java.lang.String input)
      Like calling PatternMatcherInput(input, 0, input.length()).
      PatternMatcherInput​(java.lang.String input, int begin, int length)
      Creates a PatternMatcherInput object, associating a region of a String as input to be used for pattern matching by PatternMatcher objects.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      char charAt​(int offset)
      Returns the character at a particular offset relative to the begin offset of the input.
      boolean endOfInput()
      Returns whether or not the end of the input has been reached.
      int getBeginOffset()  
      char[] getBuffer()
      Retrieves the char[] buffer to be used as input by PatternMatcher implementations to look for matches.
      int getCurrentOffset()  
      int getEndOffset()  
      java.lang.Object getInput()
      Retrieves the original input used to initialize the PatternMatcherInput instance.
      int getMatchBeginOffset()
      Returns the offset marking the beginning of the match found by contains().
      int getMatchEndOffset()
      Returns the offset marking the end of the match found by contains().
      int length()  
      java.lang.String match()
      A convenience method returning the part of the input corresponding to the last match found by a call to a Perl5Matcher contains method.
      java.lang.String postMatch()
      A convenience method returning the part of the input occurring after the last match found by a call to a Perl5Matcher contains method.
      java.lang.String preMatch()
      A convenience method returning the part of the input occurring before the last match found by a call to a Perl5Matcher contains method.
      void setBeginOffset​(int offset)
      Sets the offset of the input that should be considered the start of the region to be considered as input by PatternMatcher methods.
      void setCurrentOffset​(int offset)
      Sets the offset of the input that should be considered the current offset where PatternMatcher methods should start looking for matches.
      void setEndOffset​(int offset)
      Sets the offset of the input that should be considered the end of the region to be considered as input by PatternMatcher methods.
      void setInput​(char[] input)
      This method is identical to calling setInput(input, 0, input.length).
      void setInput​(char[] input, int begin, int length)
      Associates a region of a string (represented as a char[]) as input to be used for pattern matching by PatternMatcher objects.
      void setInput​(java.lang.String input)
      This method is identical to calling setInput(input, 0, input.length()).
      void setInput​(java.lang.String input, int begin, int length)
      Associates a region of a String as input to be used for pattern matching by PatternMatcher objects.
      void setMatchOffsets​(int matchBeginOffset, int matchEndOffset)
      This method is intended for use by PatternMatcher implementations.
      java.lang.String substring​(int beginOffset)
      Returns a new string that is a substring of the PatternMatcherInput instance.
      java.lang.String substring​(int beginOffset, int endOffset)
      Returns a new string that is a substring of the PatternMatcherInput instance.
      java.lang.String toString()
      Returns the string representation of the input, where the input is considered to start from the begin offset and end at the end offset.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • PatternMatcherInput

        public PatternMatcherInput​(java.lang.String input,
                                   int begin,
                                   int length)
        Creates a PatternMatcherInput object, associating a region of a String as input to be used for pattern matching by PatternMatcher objects. A copy of the string is not made, therefore you should not modify the string unless you know what you are doing. The current offset of the PatternMatcherInput is set to the begin offset of the region.

        Parameters:
        input - The input to associate with the PatternMatcherInput.
        begin - The offset into the String to use as the beginning of the input.
        length - The length of the region starting from the begin offset to use as the input for pattern matching purposes.
      • PatternMatcherInput

        public PatternMatcherInput​(java.lang.String input)
        Like calling
         PatternMatcherInput(input, 0, input.length());
         

        Parameters:
        input - The input to associate with the PatternMatcherInput.
      • PatternMatcherInput

        public PatternMatcherInput​(char[] input,
                                   int begin,
                                   int length)
        Creates a PatternMatcherInput object, associating a region of a string (represented as a char[]) as input to be used for pattern matching by PatternMatcher objects. A copy of the string is not made, therefore you should not modify the string unless you know what you are doing. The current offset of the PatternMatcherInput is set to the begin offset of the region.

        Parameters:
        input - The input to associate with the PatternMatcherInput.
        begin - The offset into the char[] to use as the beginning of the input.
        length - The length of the region starting from the begin offset to use as the input for pattern matching purposes.
      • PatternMatcherInput

        public PatternMatcherInput​(char[] input)
        Like calling:
         PatternMatcherInput(input, 0, input.length);
         

        Parameters:
        input - The input to associate with the PatternMatcherInput.
    • Method Detail

      • length

        public int length()
        Returns:
        The length of the region to be considered input for pattern matching purposes. Essentially this is the end offset minus the begin offset.
      • setInput

        public void setInput​(java.lang.String input,
                             int begin,
                             int length)
        Associates a region of a String as input to be used for pattern matching by PatternMatcher objects. The current offset of the PatternMatcherInput is set to the begin offset of the region.

        Parameters:
        input - The input to associate with the PatternMatcherInput.
        begin - The offset into the String to use as the beginning of the input.
        length - The length of the region starting from the begin offset to use as the input for pattern matching purposes.
      • setInput

        public void setInput​(java.lang.String input)
        This method is identical to calling:
         setInput(input, 0, input.length());
         

        Parameters:
        input - The input to associate with the PatternMatcherInput.
      • setInput

        public void setInput​(char[] input,
                             int begin,
                             int length)
        Associates a region of a string (represented as a char[]) as input to be used for pattern matching by PatternMatcher objects. A copy of the string is not made, therefore you should not modify the string unless you know what you are doing. The current offset of the PatternMatcherInput is set to the begin offset of the region.

        Parameters:
        input - The input to associate with the PatternMatcherInput.
        begin - The offset into the char[] to use as the beginning of the input.
        length - The length of the region starting from the begin offset to use as the input for pattern matching purposes.
      • setInput

        public void setInput​(char[] input)
        This method is identical to calling:
         setInput(input, 0, input.length);
         

        Parameters:
        input - The input to associate with the PatternMatcherInput.
      • charAt

        public char charAt​(int offset)
        Returns the character at a particular offset relative to the begin offset of the input.

        Parameters:
        offset - The offset at which to fetch a character (relative to the begin offset).
        Returns:
        The character at a particular offset.
        Throws:
        java.lang.ArrayIndexOutOfBoundsException - If the offset does not occur within the bounds of the input.
      • substring

        public java.lang.String substring​(int beginOffset,
                                          int endOffset)
        Returns a new string that is a substring of the PatternMatcherInput instance. The substring begins at the specified beginOffset relative to the begin offset and extends to the specified endOffset - 1 relative to the begin offset of the PatternMatcherInput instance.

        Parameters:
        beginOffset - The offset relative to the begin offset of the PatternMatcherInput at which to start the substring (inclusive).
        endOffset - The offset relative to the begin offset of the PatternMatcherInput at which to end the substring (exclusive).
        Returns:
        The specified substring.
        Throws:
        java.lang.ArrayIndexOutOfBoundsException - If one of the offsets does not occur within the bounds of the input.
      • substring

        public java.lang.String substring​(int beginOffset)
        Returns a new string that is a substring of the PatternMatcherInput instance. The substring begins at the specified beginOffset relative to the begin offset and extends to the end offset of the PatternMatcherInput.

        Parameters:
        beginOffset - The offset relative to the begin offset of the PatternMatcherInput at which to start the substring.
        Returns:
        The specified substring.
        Throws:
        java.lang.ArrayIndexOutOfBoundsException - If the offset does not occur within the bounds of the input.
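
The relative-offset arithmetic described for length(), charAt(), substring() and toString() can be illustrated with a small model. This is a sketch under stated assumptions, not this class's source code: RelativeOffsetSketch is a hypothetical name, and the model does no bounds checking of its own, so the exceptions it raises may differ from those documented above.

```java
/** Hypothetical model of offsets relative to a region's begin offset; not part of this library. */
final class RelativeOffsetSketch {
    private final char[] buffer;
    private final int begin;  // absolute offset where the region starts
    private final int end;    // absolute offset one past the region's last character

    RelativeOffsetSketch(char[] buffer, int begin, int length) {
        this.buffer = buffer;
        this.begin = begin;
        this.end = begin + length;
    }

    /** Region length: end offset minus begin offset. */
    int length() {
        return end - begin;
    }

    /** Character at an offset counted from the begin offset. */
    char charAt(int offset) {
        return buffer[begin + offset];
    }

    /** Substring from beginOffset (inclusive) to endOffset (exclusive), both relative. */
    String substring(int beginOffset, int endOffset) {
        return new String(buffer, begin + beginOffset, endOffset - beginOffset);
    }

    /** Substring from a relative beginOffset to the end of the region. */
    String substring(int beginOffset) {
        int absoluteBegin = begin + beginOffset;
        return new String(buffer, absoluteBegin, end - absoluteBegin);
    }

    /** The region itself, from the begin offset to the end offset. */
    @Override
    public String toString() {
        return new String(buffer, begin, length());
    }

    public static void main(String[] args) {
        // Region covers "world" inside "hello world".
        RelativeOffsetSketch region =
            new RelativeOffsetSketch("hello world".toCharArray(), 6, 5);
        System.out.println(region.charAt(0));
        System.out.println(region.substring(1, 3));
        System.out.println(region.substring(2));
        System.out.println(region);
    }
}
```

In this model, relative offset 0 always refers to the first character of the region, regardless of where the region sits in the underlying buffer.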
      • getInput

        public java.lang.Object getInput()
        Retrieves the original input used to initialize the PatternMatcherInput instance. If a String was used, the String instance will be returned. If a char[] was used, a char[] instance will be returned. This violates data encapsulation and hiding principles, but it is a great convenience for the programmer.

        Returns:
        The String or char[] input used to initialize the PatternMatcherInput instance.
      • getBuffer

        public char[] getBuffer()
        Retrieves the char[] buffer to be used as input by PatternMatcher implementations to look for matches. This array should be treated as read only by the programmer.

        Returns:
        The char[] buffer to be used as input by PatternMatcher implementations.
      • endOfInput

        public boolean endOfInput()
        Returns whether or not the end of the input has been reached.

        Returns:
        True if the current offset is greater than or equal to the end offset.
      • getBeginOffset

        public int getBeginOffset()
        Returns:
        The offset of the input that should be considered the start of the region to be considered as input by PatternMatcher methods.
      • getEndOffset

        public int getEndOffset()
        Returns:
        The offset of the input that should be considered the end of the region to be considered as input by PatternMatcher methods. This offset is actually 1 plus the last offset that is part of the input region.
      • getCurrentOffset

        public int getCurrentOffset()
        Returns:
        The offset of the input that should be considered the current offset where PatternMatcher methods should start looking for matches.
      • setBeginOffset

        public void setBeginOffset​(int offset)
        Sets the offset of the input that should be considered the start of the region to be considered as input by PatternMatcher methods. In other words, everything before this offset is ignored by a PatternMatcher.

        Parameters:
        offset - The offset to use as the beginning of the input.
      • setEndOffset

        public void setEndOffset​(int offset)
        Sets the offset of the input that should be considered the end of the region to be considered as input by PatternMatcher methods. This offset is actually 1 plus the last offset that is part of the input region.

        Parameters:
        offset - The offset to use as the end of the input.
      • setCurrentOffset

        public void setCurrentOffset​(int offset)
        Sets the offset of the input that should be considered the current offset where PatternMatcher methods should start looking for matches. Also resets all match offset information to -1. By calling this method, you invalidate all previous match information. Therefore a PatternMatcher implementation must call this method before setting match offset information.

        Parameters:
        offset - The offset to use as the current offset.
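
The ordering requirement stated above (set the current offset first, then the match offsets) follows from the reset to -1. A minimal model of that interaction is sketched below. MatchStateSketch and recordMatch() are hypothetical names used only for illustration; the two setters merely mirror the behaviour described in this section and are not this class's source code.

```java
/** Hypothetical model of how the current offset and match offsets interact; not part of this library. */
final class MatchStateSketch {
    private int currentOffset;
    private int matchBeginOffset = -1;
    private int matchEndOffset = -1;

    /** Moves the current offset and, as described above, invalidates match information. */
    void setCurrentOffset(int offset) {
        currentOffset = offset;
        setMatchOffsets(-1, -1);
    }

    /** Records where the last match began and ended. */
    void setMatchOffsets(int matchBegin, int matchEnd) {
        matchBeginOffset = matchBegin;
        matchEndOffset = matchEnd;
    }

    /** What a matcher would do after a match: current offset first, match offsets second. */
    void recordMatch(int matchBegin, int matchEnd) {
        setCurrentOffset(matchEnd);
        setMatchOffsets(matchBegin, matchEnd);
    }

    int getCurrentOffset() { return currentOffset; }
    int getMatchBeginOffset() { return matchBeginOffset; }
    int getMatchEndOffset() { return matchEndOffset; }

    public static void main(String[] args) {
        MatchStateSketch state = new MatchStateSketch();
        state.recordMatch(2, 5);
        System.out.println(state.getMatchBeginOffset() + ".." + state.getMatchEndOffset());

        // Moving the current offset afterwards discards the recorded match.
        state.setCurrentOffset(0);
        System.out.println(state.getMatchBeginOffset());
    }
}
```

Reversing the two calls inside recordMatch() would leave the match offsets at -1, which is why the order matters.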
      • toString

        public java.lang.String toString()
        Returns the string representation of the input, where the input is considered to start from the begin offset and end at the end offset.

        Overrides:
        toString in class java.lang.Object
        Returns:
        The string representation of the input.
      • preMatch

        public java.lang.String preMatch()
        A convenience method returning the part of the input occurring before the last match found by a call to a Perl5Matcher contains method.

        Returns:
        The input preceding a match.
      • postMatch

        public java.lang.String postMatch()
        A convenience method returning the part of the input occurring after the last match found by a call to a Perl5Matcher contains method.

        Returns:
        The input following a contains() match.
      • match

        public java.lang.String match()
        A convenience method returning the part of the input corresponding to the last match found by a call to a Perl5Matcher contains method. The method is not called getMatch() so as not to confuse it with Perl5Matcher's getMatch() which returns a MatchResult instance and also for consistency with preMatch() and postMatch().

        Returns:
        The input consisting of the match found by contains().
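
How the three convenience strings relate to the region and to the match offsets can be shown with the standard java.util.regex package. This is a sketch under stated assumptions: MatchPartsSketch and split() are hypothetical names, and java.util.regex stands in for a Perl5Matcher here, so only the offset arithmetic is illustrated.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of pre-match, match and post-match slices; not part of this library. */
final class MatchPartsSketch {

    /**
     * Finds the first match inside [begin, end) and returns
     * { text before the match, the match, text after the match },
     * or null if the region contains no match.
     */
    static String[] split(String text, int begin, int end, Pattern pattern) {
        Matcher m = pattern.matcher(text).region(begin, end);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            text.substring(begin, m.start()),  // region begin up to match begin
            m.group(),                         // match begin up to match end
            text.substring(m.end(), end)       // match end up to region end
        };
    }

    public static void main(String[] args) {
        String[] parts = split("key=value;", 0, 10, Pattern.compile("="));
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

The three slices are bounded by the region, not by the whole underlying string, so text outside the begin and end offsets never appears in them.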
      • setMatchOffsets

        public void setMatchOffsets​(int matchBeginOffset,
                                    int matchEndOffset)
        This method is intended for use by PatternMatcher implementations. It is necessary to record the location of the previous match so that consecutive contains() matches involving null string matches are properly handled. If you are not implementing a PatternMatcher, forget this method exists. If you use it outside of its intended context, you will only disrupt the stored state.

        As a note, the preMatch(), postMatch(), and match() methods are provided as conveniences because PatternMatcherInput must store match offset information to completely preserve state for consecutive PatternMatcher contains() matches.

        Parameters:
        matchBeginOffset - The begin offset of a match found by contains().
        matchEndOffset - The end offset of a match found by contains().
      • getMatchBeginOffset

        public int getMatchBeginOffset()
        Returns the offset marking the beginning of the match found by contains().

        Returns:
        The begin offset of a contains() match.
      • getMatchEndOffset

        public int getMatchEndOffset()
        Returns the offset marking the end of the match found by contains().

        Returns:
        The end offset of a contains() match.