Extracting HTML code with C# and regular expressions

Author: compiled by IT168 Technology Channel  Editor: 胡铭娅  2009-07-31 09:26  Source: IT168

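II. Official sample

How can C# and regular expressions be used to extract the code from the example above? Let's look at the sample published by Microsoft.

Example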

For the available matching options, see System.Text.RegularExpressions.RegexOptions.

class TestRegularExpressions
{
    static void Main()
    {
        string[] sentences =
        {
            "cow over the moon",
            "Betsy the Cow",
            "cowering in the corner",
            "no match here"
        };

        string sPattern = "cow";

        foreach (string s in sentences)
        {
            // Right-align each sentence in a 24-character column.
            System.Console.Write("{0,24}", s);

            if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase))
            {
                System.Console.WriteLine(" (match for '{0}' found)", sPattern);
            }
            else
            {
                System.Console.WriteLine();
            }
        }

        // Keep the console window open in debug mode.
        System.Console.WriteLine("Press any key to exit.");
        System.Console.ReadKey();
    }
}
/* Output:
       cow over the moon (match for 'cow' found)
           Betsy the Cow (match for 'cow' found)
  cowering in the corner (match for 'cow' found)
           no match here
*/

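The code above is a console application that performs a simple case-insensitive search over the strings in an array. Given the string to search and a string containing the search pattern, the static method Regex.IsMatch performs the search. The third argument indicates that case should be ignored.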
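To see what the third argument changes, the following minimal sketch runs the same check with and without RegexOptions.IgnoreCase. The class name CaseDemo and the helper Matches are illustrative names chosen here; they are not part of the official sample.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaseDemo
{
    // Hypothetical helper: wraps Regex.IsMatch so both modes can be compared.
    public static bool Matches(string input, string pattern, bool ignoreCase)
    {
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.IsMatch(input, pattern, options);
    }

    static void Main()
    {
        // "Cow" starts with an upper-case C, so only the case-insensitive call matches.
        Console.WriteLine(Matches("Betsy the Cow", "cow", false)); // False
        Console.WriteLine(Matches("Betsy the Cow", "cow", true));  // True
    }
}
```

Without the option, "Betsy the Cow" would drop out of the match list in the sample above, while the other two matching sentences would still be found.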

In practice, code like the following can be used for the match.

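// Capture the fragment between the opening div and the closing marker div.
Regex regex = new Regex("<div class=\"dxx_of\" id=\".+/>.+(?<htmlCode>.+).+<div class=\"c\"></div>");
MatchCollection matches = regex.Matches(resultHtml);
// Read the named group only when a match was found; otherwise html stays null.
string html = matches.Count > 0 ? matches[0].Groups["htmlCode"].Value : null;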

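However, "." in a regular expression matches any character except \n, while HTML source usually contains many \r\n line breaks, so a pattern like this fails on markup that spans several lines.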
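One common way around this limitation in .NET is RegexOptions.Singleline, which makes "." match \n as well. The sketch below is only an illustration under stated assumptions: the input HTML is made up, the pattern is a simplified variant of the one above, and the names SinglelineDemo and Extract are hypothetical. It is not necessarily the approach this article goes on to use.

```csharp
using System;
using System.Text.RegularExpressions;

static class SinglelineDemo
{
    // Simplified, hypothetical pattern in the spirit of the one above.
    const string Pattern =
        "<div class=\"dxx_of\"[^>]*>(?<htmlCode>.+?)<div class=\"c\"></div>";

    // Returns the trimmed captured fragment, or null when the pattern does not match.
    public static string Extract(string html, RegexOptions options)
    {
        Match match = Regex.Match(html, Pattern, options);
        return match.Success ? match.Groups["htmlCode"].Value.Trim() : null;
    }

    static void Main()
    {
        // Made-up input with Windows line endings (\r\n).
        string html = "<div class=\"dxx_of\" id=\"a1\">\r\n<p>hello</p>\r\n<div class=\"c\"></div>";

        // Default options: "." stops at \n, so nothing is found.
        Console.WriteLine(Extract(html, RegexOptions.None) == null); // True
        // Singleline: "." also matches \n, so the fragment is captured.
        Console.WriteLine(Extract(html, RegexOptions.Singleline));    // <p>hello</p>
    }
}
```

The lazy quantifier ".+?" is used here so that the capture stops at the first closing marker instead of running to the last one on the page.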
