iOS Programming: The Big Nerd Ranch Guide, 3/e (Big Nerd Ranch Guides)

Authors: Aaron Hillegass, Joe Conway

NSRegularExpression

Currently, the table of RSSItems shows each item’s title, and this title is an NSString that consists of the name of the subforum, the title of the post, and the name of the author:

 
General Book Discussions :: BNR Books :: Reply by Everyone
 

It would be easier to browse posts if the table showed just the titles of the posts. To accomplish this, we need to parse the full item title, grab the string between the two sets of double colons, and set the title of each RSSItem to be just this string.

 

In RSSChannel.h, declare a new method that will eventually do these things.

 
@interface RSSChannel : NSObject
{
    NSMutableString *currentString;
}
@property (nonatomic, weak) id parentParserDelegate;
@property (nonatomic, strong) NSString *title;
@property (nonatomic, strong) NSString *infoString;
@property (nonatomic, readonly, strong) NSMutableArray *items;
- (void)trimItemTitles;
@end

In RSSChannel.m, send this message in parser:didEndElement:namespaceURI:qualifiedName: when the channel gives up control of the parser.

 
- (void)parser:(NSXMLParser *)parser
 didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
{
    currentString = nil;
    if ([elementName isEqual:@"channel"]) {
        [parser setDelegate:parentParserDelegate];
        [self trimItemTitles];
    }
}
 

We could loop through all item titles and write code that parses each string so that only the title of the post remains, but we can achieve the same result using a regular expression. First, let’s take a closer look at regular expressions and how they work in iOS.

 

Regular expressions are a powerful way of searching text. When you create a regular expression, you devise a pattern string that represents the pattern you want to find in some other string. So, the purpose of a regular expression is to find substrings within a larger string. You apply the regular expression to the searched string, and if the pattern is found, then the searched string is said to be a match.

 

In iOS, a regular expression is represented by an instance of NSRegularExpression. An NSRegularExpression is initialized with a pattern string and can be applied to any instance of NSString. This is done with the NSRegularExpression instance method matchesInString:options:range:. You pass the string to be searched, and it returns an array of matches. Each match is represented by an instance of the class NSTextCheckingResult. (If there are no matches, then an empty array is returned.)

 

Every NSTextCheckingResult has a range property that identifies the location and length of the match in the searched string. The range property is an NSRange structure. A range has a location (the index of the character where the match begins) and a length (the number of characters in the match).

 

Figure 26.8 shows some examples of strings matched to the pattern string “This is a pattern” and the matched range. When a string has more than one match, there are multiple instances of NSTextCheckingResult in the array returned from matchesInString:options:range:.
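As a minimal sketch (not code from the book), the function below shows how several matches in one string come back as separate NSTextCheckingResult objects, each with its own range. The helper name MatchRangeStrings and the sample strings are assumptions made for illustration, and the code assumes ARC.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the book): returns one
// "{location, length}" string for every match of pattern in searched.
static NSArray *MatchRangeStrings(NSString *pattern, NSString *searched)
{
    NSRegularExpression *reg =
        [[NSRegularExpression alloc] initWithPattern:pattern
                                             options:0
                                               error:nil];
    NSArray *matches = [reg matchesInString:searched
                                    options:0
                                      range:NSMakeRange(0, [searched length])];

    NSMutableArray *rangeStrings = [NSMutableArray array];
    for (NSTextCheckingResult *result in matches) {
        // NSStringFromRange formats an NSRange as "{location, length}"
        [rangeStrings addObject:NSStringFromRange([result range])];
    }
    return rangeStrings;
}
```

Searching the string "This is a pattern. This is a pattern." for the pattern "This is a pattern" should produce two entries: one match at location 0 and one at location 19, each 17 characters long.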

 

Figure 26.8  Range of matched strings

 
 

For practice with these classes, let’s create an NSRegularExpression that looks for RSSItems that contain the word Author in the title. This will get us all of the original posts; replies to posts contain the word Reply in their titles instead of Author. In RSSChannel.m, enter the following code for trimItemTitles.

 
- (void)trimItemTitles
{
    // Create a regular expression with the pattern: Author
    NSRegularExpression *reg =
                    [[NSRegularExpression alloc] initWithPattern:@"Author"
                                                         options:0
                                                           error:nil];
    // Loop through every title of the items in channel
    for (RSSItem *i in items) {
        NSString *itemTitle = [i title];
        // Find matches in the title string. The range
        // argument specifies how much of the title to search;
        // in this case, all of it.
        NSArray *matches = [reg matchesInString:itemTitle
                                        options:0
                                          range:NSMakeRange(0, [itemTitle length])];
        // If there was a match...
        if ([matches count] > 0) {
            // Print the location of the match in the string and the string
            NSTextCheckingResult *result = [matches objectAtIndex:0];
            NSRange r = [result range];
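            // NSRange fields are NSUInteger, so cast them for the %lu specifier
            NSLog(@"Match at {%lu, %lu} for %@!",
                  (unsigned long)r.location, (unsigned long)r.length, itemTitle);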
        }
    }
}

In this code, you first create the expression and initialize it with the pattern you’re seeking. Then, for each RSSItem, you send the matchesInString:options:range: message to the regular expression, passing the item’s title as the string to search. Finally, because we expect only one match per title, we get the first NSTextCheckingResult in the matches array, get its range, and log it to the console.
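The listing above passes nil for the error argument and asks for every match even though only the first is used. As a hedged alternative sketch (not the book's code), the function below checks the NSError when the pattern fails to compile and uses firstMatchInString:options:range:, which stops at the first match. The name TitleContainsPattern is hypothetical, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the book): reports whether title
// contains at least one match for pattern.
static BOOL TitleContainsPattern(NSString *title, NSString *pattern)
{
    NSError *error = nil;
    NSRegularExpression *reg =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (!reg) {
        // The pattern could not be compiled (for example, an unbalanced "(")
        NSLog(@"Invalid pattern %@: %@", pattern, [error localizedDescription]);
        return NO;
    }

    // firstMatchInString:options:range: returns nil when nothing matches
    NSTextCheckingResult *first =
        [reg firstMatchInString:title
                        options:0
                          range:NSMakeRange(0, [title length])];
    return first != nil;
}
```

Checking the return value of the initializer rather than the error pointer follows the usual Cocoa convention: the error is only meaningful when the call has failed.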

 

Build and run the application. Check the console output after all of the items are pulled down from the server. You should see only items for original posts.

 
Constructing a pattern string

In the regular expression you just created, the pattern consisted solely of a literal string. That’s not very powerful or flexible. But we can combine the literal strings with symbols that have special meanings to express more complex patterns. These special symbols are called metacharacters and operators.

 

For instance, the dot character (.) is a metacharacter that means “any character.” Thus, a pattern string T.e would match The, Tie, or any other string that has a T followed by any character followed directly by an e.
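To try the dot metacharacter outside the application, a small sketch like the following can count matches with numberOfMatchesInString:options:range:. The helper name CountMatches and the test strings are assumptions made for illustration; ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the book): counts how many times
// pattern matches inside searched. Returns 0 for an invalid pattern,
// because messaging a nil regular expression returns 0.
static NSUInteger CountMatches(NSString *pattern, NSString *searched)
{
    NSRegularExpression *reg =
        [[NSRegularExpression alloc] initWithPattern:pattern
                                             options:0
                                               error:nil];
    return [reg numberOfMatchesInString:searched
                                options:0
                                  range:NSMakeRange(0, [searched length])];
}
```

With the pattern T.e, the strings The and Tie each give one match, while Te and Tab give none: the dot must consume exactly one character between the T and the e.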

 

Operators in a regular expression typically follow a literal character or metacharacter and modify how the character is to be matched. For instance, the asterisk (*) operator means, “See the character just before me? If you see that character, that’s a match. If you see it repeated multiple times, that’s a match. If you see nothing (the character occurs zero times), that’s a match, too. But if you see any other character, the match fails.”

 

Thus, if the pattern string were Bo*o, many strings would match this regular expression: Boo, Bo, Booo, Booooooo, etc. But Bio would fail, and Boot would not match as a whole (a search would find only the Boo inside it).
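The difference between finding the pattern somewhere inside a string and matching the entire string matters for cases like Boot. The sketch below (not from the book) contrasts the two; ContainsMatch and MatchesWholeString are hypothetical names, and wrapping the pattern in ^(?:...)$ is one possible way to require a whole-string match. ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if pattern is found anywhere in searched.
static BOOL ContainsMatch(NSString *pattern, NSString *searched)
{
    NSRegularExpression *reg =
        [[NSRegularExpression alloc] initWithPattern:pattern
                                             options:0
                                               error:nil];
    NSUInteger count =
        [reg numberOfMatchesInString:searched
                             options:0
                               range:NSMakeRange(0, [searched length])];
    return count > 0;
}

// Hypothetical helper: YES only if the pattern accounts for the whole
// string. The ^ and $ anchors pin the match to the start and end, and
// (?:...) groups the original pattern without capturing it.
static BOOL MatchesWholeString(NSString *pattern, NSString *searched)
{
    NSString *anchored = [NSString stringWithFormat:@"^(?:%@)$", pattern];
    return ContainsMatch(anchored, searched);
}
```

For the pattern Bo*o, MatchesWholeString accepts Boo, Bo, and Booooooo and rejects Bio and Boot, while ContainsMatch still reports a match in Boot because of the Boo at its start.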

 

A pattern string can contain any number of literal characters, metacharacters, and operators. Combining them gives you an enormous amount of matching power. You can see the full list of metacharacters and operators in the documentation for NSRegularExpression. Here are some examples of regular expressions and strings that would result in a match:

 
Pattern      Matches
------------------------------------------------
Joh?n        Jon, John
Mike*        Mike, Mik, Mikee, Mikeee, etc.
(R|B)ob      Rob, Bob
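
One way to check patterns like these is to look at the text each one actually matches. The sketch below is not from the book; FirstMatchedText is a hypothetical helper, and ARC is assumed. It also illustrates that the alternation operator (|) applies to everything on either side of it, so choosing between two single letters needs parentheses.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the book): returns the text of the
// first match of pattern in searched, or nil if there is no match.
static NSString *FirstMatchedText(NSString *pattern, NSString *searched)
{
    NSRegularExpression *reg =
        [[NSRegularExpression alloc] initWithPattern:pattern
                                             options:0
                                               error:nil];
    NSTextCheckingResult *result =
        [reg firstMatchInString:searched
                        options:0
                          range:NSMakeRange(0, [searched length])];
    if (!result) {
        return nil;
    }
    return [searched substringWithRange:[result range]];
}
```

Applied to Rob, the pattern (R|B)ob matches the whole word, whereas R|Bob (read as "R, or Bob") matches only the leading R. Joh?n matches all of Jon, and Mike* matches all of Mikeee.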
 

Our original problem (extracting the middle portion of the item’s title) may not seem suited for regular expressions and matching at first, but we can solve it easily using the same classes.

 

First, we need to create a regular expression that is a wild card search that returns all item titles. Take a look at the full titles of the RSSItems. They all have the same format:

 
(Subforum) :: (Title) :: (Author)
 

In trying to construct a pattern string to match this format, we can’t use just a literal string as the pattern. Instead, we can use .* for our search. This combination of the . metacharacter and the * operator means “match anything, no matter how long it is.” Thus, our pattern to match the title of every RSSItem will look like this:

 
.* :: .* :: .*

The first .* is the name of the subforum, the second is the post title, and the third is the author’s name. These are separated by the literal string " :: " (don’t forget the space on each side of the double colon!). In RSSChannel.m, change the regular expression in the trimItemTitles method to have this pattern.

 
// Replace the @"Author" pattern with the wildcard pattern:
NSRegularExpression *reg =
                [[NSRegularExpression alloc] initWithPattern:@".* :: .* :: .*"
                                                     options:0
                                                       error:nil];
 

Build and run the application. By design, every RSSItem’s full title matches this pattern, so every title is printed to the console. This is a good start. Next we’ll use a capture group to extract the title of the post (the string in the middle of the two ::) from the full title. Using a capture group in a regular expression allows you to access a specific substring of a match. Basically, when a pattern string has one or more capture groups, each NSTextCheckingResult that represents a match will contain an NSRange structure for each capture group in the pattern.

 

Capture groups are formed by putting parentheses around a term in the regular expression pattern. Thus, to get our post title, we will use the following regular expression:

 
.* :: (.*) :: .*
 

Update trimItemTitles in RSSChannel.m to capture the string between the double colons.

 
    
    // Replace the wildcard pattern with one that has a capture group:
NSRegularExpression *reg =
                    [[NSRegularExpression alloc] initWithPattern:@".* :: (.*) :: .*"
                                                         options:0
                                                           error:nil];
 

Now instead of having one range, each NSTextCheckingResult that is returned will have an array of ranges. The first range is always the range of the overall match. The next range is the first capture group, the next is the second capture group, and so on. You can get the capture group range by sending the message rangeAtIndex: to the NSTextCheckingResult. In our regular expression, there is one capture group, so the second range is the one we’re looking for. You can then pass this range to NSString’s substringWithRange: method to get back an NSString defined by the range.

 

In trimItemTitles, pull out the range from the NSTextCheckingResult, get the associated string (the post title), and then set it as the item’s title.

 
if ([matches count] > 0) {
    NSTextCheckingResult *result = [matches objectAtIndex:0];
    NSRange r = [result range];
    // NSRange fields are NSUInteger, so cast them for the %lu specifier
    NSLog(@"Match at {%lu, %lu} for %@!",
          (unsigned long)r.location, (unsigned long)r.length, itemTitle);
    // One capture group, so two ranges, let's verify
    if ([result numberOfRanges] == 2) {
        // Pull out the 2nd range, which will be the capture group
        NSRange titleRange = [result rangeAtIndex:1];
        // Set the title of the item to the string within the capture group
        [i setTitle:[itemTitle substringWithRange:titleRange]];
    }
}
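
The same capture-group logic can be tried in isolation. The function below is a minimal sketch rather than the book's implementation: PostTitleFromFullTitle is a hypothetical name, it operates on a plain NSString instead of an RSSItem, and it returns the input unchanged when the title does not fit the expected format. ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the book): extracts the post title from
// a string of the form "(Subforum) :: (Title) :: (Author)".
static NSString *PostTitleFromFullTitle(NSString *fullTitle)
{
    NSRegularExpression *reg =
        [[NSRegularExpression alloc] initWithPattern:@".* :: (.*) :: .*"
                                             options:0
                                               error:nil];
    NSTextCheckingResult *result =
        [reg firstMatchInString:fullTitle
                        options:0
                          range:NSMakeRange(0, [fullTitle length])];

    // Range 0 is the overall match; range 1 is the single capture group
    if (result && [result numberOfRanges] == 2) {
        NSRange titleRange = [result rangeAtIndex:1];
        if (titleRange.location != NSNotFound) {
            return [fullTitle substringWithRange:titleRange];
        }
    }
    // No match: leave the title as it was
    return fullTitle;
}
```

For "General Book Discussions :: BNR Books :: Reply by Everyone" this returns "BNR Books". Because .* is greedy, a title with an extra separator, such as "A :: B :: C :: D", yields the second-to-last segment ("C"): the first .* consumes as much as it can while still leaving room for the rest of the pattern.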
 

Build and run the application. After the connection finishes loading, reveal the master table view. The rows will now display just the titles of the posts.

 

You can become more comfortable with regular expressions by playing with Xcode’s find interface. Hit Command-F to bring up the Find Bar. Then click on the magnifying glass icon next to the search field and select Show Find Options, as shown in Figure 26.9. Change the Style pop-up menu to Regular Expression.

 

Figure 26.9  Regular expression search in Xcode

 

Now you can experiment with applying different regular expressions to your code files.

 

Regular expressions are a complex topic, and there’s a lot more to them than what we’ve used here. Visit the documentation for NSRegularExpression to learn more about using regular expressions in iOS.

 
