C# - HowTo Parse a URL

27. June 2011 08:00

 

Parsing a URL seems like it should be easy, and in C# it actually is. However, if you search on Google you will find all sorts of solutions, using regular expressions and various other tricks, most of which I have always found really ugly. If you're going down that route, stop and think: this has been a problem for over 20 years, so there must be a more common solution.

 

One of the more extreme approaches looks something like this (VB.NET).

 

Protected Function ExtractDomainFromURL(ByVal sURL As String) As String
    Dim rg As New Regex("://(?<host>([a-z\d][-a-z\d]*[a-z\d]\.)*[a-z][-a-z\d]+[a-z])")

    If rg.IsMatch(sURL) Then
        Return rg.Match(sURL).Result("${host}")
    Else
        Return String.Empty
    End If
End Function

 

Something like that may work, but can you still read it in six months' time?

What about the path? What about correctly decoding the path? What about the query parameters?

 

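You can do all of this in C# by adding a reference to System.Web and using the following code. The built-in Uri class does the parsing, and HttpUtility (from System.Web) handles the decoding and the query string.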

 

using System;
using System.Collections.Specialized; // NameValueCollection
using System.Web;                     // HttpUtility (needs the System.Web reference)

class Program
{
    static void Main(string[] args)
    {
        Uri tmp = new Uri("http://www.google.co.uk/search?hl=en&q=parsing+a+url+in+c%23&aq=f&aqi=g1g-j9&aql=&oq=");

        Console.WriteLine("Protocol: {0}", tmp.Scheme);
        Console.WriteLine("Host: {0}", tmp.Host);
        // AbsolutePath is still percent-encoded, so decode it before printing.
        Console.WriteLine("Path: {0}", HttpUtility.UrlDecode(tmp.AbsolutePath));
        Console.WriteLine("Query: {0}", tmp.Query);

        // ParseQueryString splits the query into decoded key/value pairs.
        NameValueCollection Parms = HttpUtility.ParseQueryString(tmp.Query);
        Console.WriteLine("Parms: {0}", Parms.Count);
        foreach (string x in Parms.AllKeys)
            Console.WriteLine("\tParm: {0} = {1}", x, Parms[x]);

        Console.ReadLine();
    }
}

 

The program produces the following output, with a correctly decoded URL and easy access to the query string parameters.

 

Protocol: http
Host: www.google.co.uk
Path: /search
Query: ?hl=en&q=parsing+a+url+in+c%23&aq=f&aqi=g1g-j9&aql=&oq=
Parms: 6
        Parm: hl = en
        Parm: q = parsing a url in c#
        Parm: aq = f
        Parm: aqi = g1g-j9
        Parm: aql =
        Parm: oq =
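
As a rough sketch, here is how the regular expression function from earlier might be rewritten around Uri. The ExtractHost name and the decision to return an empty string for bad input are my own assumptions for illustration, chosen to mirror that function. It uses Uri.TryCreate rather than the constructor, because new Uri(...) throws a UriFormatException when the string is not a valid URL.

```csharp
using System;

class SafeParseExample
{
    // Hypothetical helper (not part of the program above): returns the host,
    // or an empty string when the input is not an absolute URL.
    public static string ExtractHost(string url)
    {
        Uri parsed;
        // TryCreate returns false for malformed input instead of throwing.
        if (Uri.TryCreate(url, UriKind.Absolute, out parsed))
            return parsed.Host;

        return String.Empty;
    }

    static void Main()
    {
        // Prints "www.google.co.uk"
        Console.WriteLine(ExtractHost("http://www.google.co.uk/search?hl=en"));
        // No scheme, so this is not an absolute URL and the result is empty.
        Console.WriteLine(ExtractHost("not a url").Length);
    }
}
```

This only needs the System namespace, so there is no extra reference to add if the host is all you are after.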

 

Enjoy!
