Wednesday, June 6, 2012

Remove HTML tags from string in C#

In this article we will discuss how to remove HTML tags from a string using C#. In my last article we discussed how to extract the first N words from a string with C#.

Sometimes we need to strip the HTML tags from a string and keep only the plain text it contains.

Below is the sample code. The input string contains HTML tags, like this:
string htmlString = "<html><head><div> Hello World </div><br/> Br Should not come.</head><body><b>But what about this? Will this | will come? ) </b></body></html>";

HTML Code:
<asp:Button ID="btnTest" runat="server" Text="Click Here to Test" OnClick="btnTest_Click" /><br />
<asp:Label ID="lblResult" runat="server" Text=""></asp:Label>

.CS Code:
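// Requires: using System.Text.RegularExpressions;
protected void btnTest_Click(object sender, EventArgs e)
{
    string htmlString = "<html><head><div> Hello World </div><br/> Br Should not come.</head><body><b>But what about this? Will this | will come? ) </b></body></html>";

    // Match "<", then any run of characters other than ">", then ">".
    Regex regex = new Regex("<[^>]*>");
    // Alternative (non-greedy) pattern:
    //Regex regex = new Regex("<(.|\n)*?>");

    // Replace every matched tag with an empty string.
    string withOutHTMLTagString = regex.Replace(htmlString, "");

    lblResult.Text = withOutHTMLTagString;
}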

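The output, as rendered in the browser, will be:
"Hello World Br Should not come.But what about this? Will this | will come? )"

Note that the raw string returned by Replace still contains the spaces that surrounded the removed tags (a leading space, two spaces between "World" and "Br", and a trailing space). The browser collapses them when it renders the label.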
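If the cleaned text is used outside a web page (for example, written to a file or a console), the extra spaces left behind by the removed tags are not collapsed automatically. The following is a minimal sketch of how the same regex could be wrapped in a reusable helper that also normalizes whitespace. The class name HtmlTagStripper and the method name StripTags are hypothetical and not part of the original sample; the whitespace step is an addition, not something the article's code does.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names here are illustrative only.
public static class HtmlTagStripper
{
    // Same pattern as in the article: "<", any run of non-">" characters, ">".
    private static readonly Regex TagPattern = new Regex("<[^>]*>");

    // One or more consecutive whitespace characters.
    private static readonly Regex WhitespacePattern = new Regex(@"\s+");

    public static string StripTags(string html)
    {
        // Treat null or empty input as "nothing to strip".
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Step 1: remove every tag.
        string withoutTags = TagPattern.Replace(html, "");

        // Step 2 (assumption, not in the article): collapse runs of
        // whitespace to a single space and trim both ends.
        return WhitespacePattern.Replace(withoutTags, " ").Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        string htmlString = "<html><head><div> Hello World </div><br/> Br Should not come.</head><body><b>But what about this? Will this | will come? ) </b></body></html>";

        // Prints: Hello World Br Should not come.But what about this? Will this | will come? )
        Console.WriteLine(HtmlTagStripper.StripTags(htmlString));
    }
}
```

Like the original, this is a simple pattern-based approach: it assumes that no ">" character appears inside an attribute value and it does not decode entities such as &amp;amp;, so it is best suited to small, well-formed fragments.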



