I need a regular expression that will return all the links on an HTML page.
Links look like this:
<a href="http://www.google.com">google</a>
Links may have other attributes on them, such as class or id.
I want to collect all the text that appears on every page of my website into a string. Is that possible?
For Sanjib's query, the following code (originally written by Scott Mitchell in VB and ported to C# by vulpes in the C#Friends Forum) may help.
<code>
<%@ Import Namespace="System.Net" %>
<script language="C#" runat="server">
void Page_Load(Object sender, EventArgs e)
{
    // STEP 1: Create a WebClient instance
    WebClient objWebClient = new WebClient();

    // STEP 2: Call the DownloadData method
    const string strURL = "http://www.aspmessageboard.com/";
    byte[] aRequestedHTML = objWebClient.DownloadData(strURL);

    // STEP 3: Convert the byte array into a string
    UTF8Encoding objUTF8 = new UTF8Encoding();
    string strRequestedHTML = objUTF8.GetString(aRequestedHTML);

    // WE'RE DONE! - display the string
    lblHTMLOutput.Text = strRequestedHTML;
}
</script>
<html>
<body>
<h1>Screen Scrape of www.aspmessageboard.com</h1>
<p>
<asp:label id="lblHTMLOutput" runat="server" />
</p>
</body>
</html>
</code>
I have tested it, and the content of the page comes back as a string.
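Going back to the original question about links: once a page's HTML is held in a string, a regular expression can pull out the href values. Here is a minimal sketch, not a definitive solution. It assumes the href value is wrapped in double or single quotes, and the names LinkExtractor and ExtractLinks are made up for this example. A regex is not a real HTML parser, so unquoted attributes or unusual markup may be missed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Matches an <a ...> opening tag and captures the quoted href value in the
    // group "url". Other attributes (class, id, ...) may appear before href.
    // Assumption: the href value is quoted with " or '.
    private static readonly Regex AnchorHref = new Regex(
        @"<a\s+(?:[^>]*?\s)?href\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)')",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns every href value found in the given HTML, in document order.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in AnchorHref.Matches(html))
        {
            links.Add(m.Groups["url"].Value);
        }
        return links;
    }
}
```

Calling LinkExtractor.ExtractLinks on the string `<a href="http://www.google.com">google</a>` gives a list containing the single entry http://www.google.com. The same call should also work on a string downloaded with WebClient.DownloadData and decoded as UTF-8.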