27. June 2011 08:00
Parsing a URL seems like it should be easy, and in C# it actually is. However, if you search on Google you will find all sorts of solutions, using regular expressions and various other approaches, most of which I have always found really ugly. If you are going down that route, bear in mind that this problem has been around for over 20 years, so there must be a more common solution.
One of the more extreme methods looks something like this.
Protected Function ExtractDomainFromURL(ByVal sURL As String) As String
    Dim rg As New Regex("://(?<host>([a-z\d][-a-z\d]*[a-z\d]\.)*[a-z][-a-z\d]+[a-z])")
    If rg.IsMatch(sURL) Then
        Return rg.Match(sURL).Result("${host}")
    Else
        Return String.Empty
    End If
End Function
Something like that may work, but can you read it again in six months' time?
What about the path? What about correctly decoding the path? What about the parameters?
You can do all of this in C# by adding a reference to System.Web (needed for HttpUtility) and using the following code.
using System;
using System.Collections.Specialized;
using System.Web;

class Program
{
    static void Main(string[] args)
    {
        // Uri splits the URL into its components for us.
        Uri uri = new Uri("http://www.google.co.uk/search?hl=en&q=parsing+a+url+in+c%23&aq=f&aqi=g1g-j9&aql=&oq=");
        Console.WriteLine("Protocol: {0}", uri.Scheme);
        Console.WriteLine("Host: {0}", uri.Host);
        Console.WriteLine("Path: {0}", HttpUtility.UrlDecode(uri.AbsolutePath));
        Console.WriteLine("Query: {0}", uri.Query);

        // ParseQueryString decodes each key and value in the query string.
        NameValueCollection parms = HttpUtility.ParseQueryString(uri.Query);
        Console.WriteLine("Parms: {0}", parms.Count);
        foreach (string x in parms.AllKeys)
            Console.WriteLine("\tParm: {0} = {1}", x, parms[x]);

        // Keep the console window open until Enter is pressed.
        Console.ReadLine();
    }
}
The program produces the following output, with a correctly decoded URL and access to the query string.
Protocol: http
Host: www.google.co.uk
Path: /search
Query: ?hl=en&q=parsing+a+url+in+c%23&aq=f&aqi=g1g-j9&aql=&oq=
Parms: 6
	Parm: hl = en
	Parm: q = parsing a url in c#
	Parm: aq = f
	Parm: aqi = g1g-j9
	Parm: aql =
	Parm: oq =
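One thing the program above glosses over is that new Uri(...) throws a UriFormatException when the string is not a valid URL. If the URL comes from user input, a rough sketch along these lines avoids the exception by using Uri.TryCreate instead. The UrlParts class and the TryDescribe method are names made up for this example, not part of the framework, and the sketch assumes you only care about the host and the query string.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

static class UrlParts
{
    // Hypothetical helper: returns false instead of throwing
    // when the input cannot be parsed as an absolute URL.
    public static bool TryDescribe(string input, out string host, out NameValueCollection query)
    {
        host = null;
        query = null;

        Uri uri;
        if (!Uri.TryCreate(input, UriKind.Absolute, out uri))
            return false;

        host = uri.Host;
        // Same decoding as in the main example.
        query = HttpUtility.ParseQueryString(uri.Query);
        return true;
    }
}
```

Call it as UrlParts.TryDescribe(someString, out host, out query) and only read host and query when it returns true. Note that UriKind.Absolute accepts any scheme, so if you only want http or https you would still need to check uri.Scheme yourself.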
Enjoy 