I really don't mean for this blog to be T-SQL only, but recently the most work I've done that most easily translates to a blog has been T-SQL. Today's entry is on the use of the PATINDEX function when converting and manipulating data.
Simply put, PATINDEX finds the first instance of a character in a given string. This function is similar to CHARINDEX, except that you can specify a starting point. The ability to specify a starting point makes it very useful in conversions.
A recent example that I've had was data that had been entered as follows: "John Q. Public". I needed the data in three (3) different fields ("John", "Q.", "Public"). How was I to go about splitting up the data?
Getting the First Name
The first name was easy. I just used the PATINDEX to find the first instance of ' '. The T-SQL looks like this:
FirstName = Substring([fullname], 0, patindex( '% %', [fullname]))
This T-SQL says to take the [fullname] column, start at index 0 (the start), and give me everything until the index of the first space ('% %'). Very easy.
Getting the Middle Initial
Getting the Middle Initial is similar to getting the first name, except the 0 index starting point has to be replaced by the patindex of the first space. In other words, we want the index of a space that occurs after the first instance of a space. I know the text is confusing, but the code is worse.
MI =
substring (
SUBSTRING([fullname], (patindex( '% %' , [fullname]) +1 ), LEN([fullname])) -- expession firstname
, 1 --start
, patindex( '% %' , SUBSTRING([fullname], (patindex( '% %' , [fullname]) + 1 ), LEN([fullname]))) -1 ) -- length firstname
Yes, that's ugly. But essentially it's the same code as getting the first name, except now we're taking the taking the original script for finding the first name and slapping it in as the staring point for finding the Middle Initial. As long as the open and closing brackets are kept track of, it's manageable.
Finding the Last Name
In theory, we could continue building on the above code to find the last name, but the code would get so confusing as to be unmanageable. But, for fun, let's see what it would look like:
LastName = ltrim(substring(
[fullname] --expression
, (patindex( '% %' , [fullname]) + len(substring (SUBSTRING([fullname], (patindex( '% %' , [fullname]) + 1 ), LEN([fullname]))
, 1, patindex( '% %' , SUBSTRING([fullname], (patindex( '% %' , [fullname]) + 1 ), LEN([fullname]))) -1 ))) + 2 -- start
, len([fullname])
)
)
You can see the two previous example crammed into that code, but it's extremely hard to read and virtually impossible to maintain. Let's find a better way. If we think out of the box, the last name is all the characters after the LAST space in the string. So if we find the last space, we can then just use substring. The easiest way I've found to get the last space is to simple reverse the string and then use PATINDEX to find the first space. I then subtract the PATINDEX from the length of the string, and now you have your starting point.
LastName = substring(fullname, len(fullname) - patindex('% %', reverse(fullname)) + 2, patindex('% %', reverse(fullname)))
I added two (2) to the PATINDEX to compensate for the '.' and the extra space after the Middle Initial.
Anyway, that's all for this time. I hope this article has been useful. Happy coding!
42b99833-4758-48b2-b1a2-7e2ff7cec400|0|.0