Extract metadata and links from an HTML page using HtmlAgilityPack

In this tutorial, I will show you how to extract metadata and links from an HTML page using HtmlAgilityPack.

Implementation

LinkItem

Represents a link:

  • Href: URL of the link (the value of the href attribute)
  • Text: inner text of the <a> tag

LinkExtractor

Extracts LinkItems from the htmlData string parameter.
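
The original listing is not shown here, so the following is only a minimal sketch of how LinkItem and LinkExtractor could look. The method name Extract, the static class, and the choice to skip <a> tags without an href attribute are assumptions, not the author's code. It relies on the HtmlAgilityPack NuGet package.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Represents a link: the href value and the inner text of an <a> tag.
public class LinkItem
{
    public string Href { get; set; }
    public string Text { get; set; }

    public override string ToString() => $"{Text} ({Href})";
}

public static class LinkExtractor
{
    // Hypothetical method name; the original signature is not known.
    public static List<LinkItem> Extract(string htmlData)
    {
        var links = new List<LinkItem>();
        if (string.IsNullOrWhiteSpace(htmlData))
            return links;

        var doc = new HtmlDocument();
        doc.LoadHtml(htmlData);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (var anchor in anchors)
        {
            links.Add(new LinkItem
            {
                Href = anchor.GetAttributeValue("href", string.Empty),
                // Decode entities such as &amp; and trim surrounding whitespace.
                Text = HtmlEntity.DeEntitize(anchor.InnerText).Trim()
            });
        }

        return links;
    }
}
```

Under these assumptions, calling LinkExtractor.Extract(htmlData) returns one LinkItem per <a href="..."> element, in document order. Relative URLs are returned as written; resolving them against a base address is left to the caller.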

HtmlMetadata

Represents HTML page metadata:

  • Title: title of the HTML document
  • Description: description of the web page
  • Keywords: keywords (for search engines)
  • Author: author of the page

HtmlMetadataProvider
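
By analogy with LinkExtractor, this class presumably builds an HtmlMetadata from an HTML string. The sketch below is written under that assumption: the method name GetMetadata, the static class, and the mapping of the description, keywords and author fields to <meta name="..." content="..."> tags are illustrative choices, not the original listing.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Represents HTML page metadata.
public class HtmlMetadata
{
    public string Title { get; set; }
    public string Description { get; set; }
    public string Keywords { get; set; }
    public string Author { get; set; }
}

public static class HtmlMetadataProvider
{
    // Hypothetical method name; the original signature is not known.
    public static HtmlMetadata GetMetadata(string htmlData)
    {
        var metadata = new HtmlMetadata();
        if (string.IsNullOrWhiteSpace(htmlData))
            return metadata;

        var doc = new HtmlDocument();
        doc.LoadHtml(htmlData);

        // The title comes from the <title> element, if there is one.
        var titleNode = doc.DocumentNode.SelectSingleNode("//title");
        if (titleNode != null)
            metadata.Title = HtmlEntity.DeEntitize(titleNode.InnerText).Trim();

        // SelectNodes returns null when no <meta name="..."> tag exists.
        var metaNodes = doc.DocumentNode.SelectNodes("//meta[@name]");
        if (metaNodes == null)
            return metadata;

        foreach (var meta in metaNodes)
        {
            string name = meta.GetAttributeValue("name", string.Empty);
            string content = meta.GetAttributeValue("content", string.Empty);

            // Compare names case-insensitively: "Description" and "description" both match.
            switch (name.Trim().ToLowerInvariant())
            {
                case "description": metadata.Description = content; break;
                case "keywords": metadata.Keywords = content; break;
                case "author": metadata.Author = content; break;
            }
        }

        return metadata;
    }
}
```

With this sketch, fields that the page does not declare stay null, so the caller can distinguish a missing tag from an empty one. If a name appears more than once, the last occurrence wins.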

Usage
