JSOUP: HTML Parser
The jsoup library provides features to parse HTML pages. Please refer to https://jsoup.org/ for API details.
jsoup implements the WHATWG HTML5 specification and parses HTML to the same DOM as modern browsers do.
With jsoup you can:
1. scrape and parse HTML from a URL, file, or string
2. find and extract data, using DOM traversal or CSS selectors
3. manipulate the HTML elements, attributes, and text
4. clean user-submitted content against a safe white-list, to prevent XSS attacks
5. output tidy HTML
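The capabilities listed above can be sketched roughly as follows. This is only an illustrative sketch, not taken from the jsoup documentation: the class name JsoupFeatures, the sample markup, and the example.com URL are made up for the demonstration, and it assumes jsoup 1.14.2 or later, where the white-list class is named Safelist (older releases call it Whitelist). It parses from a string so that it runs without network access.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

public class JsoupFeatures {

    // Hypothetical sample markup, used only for illustration.
    static final String HTML =
        "<html><head><title>Sample</title></head>"
        + "<body><div id=\"main\"><p class=\"intro\">Hello <b>world</b></p>"
        + "<a href=\"/docs\">Docs</a></div></body></html>";

    // Items 1 and 2: parse HTML from a string, then extract text with a CSS selector.
    static String introText() {
        Document doc = Jsoup.parse(HTML);
        return doc.select("p.intro").text();
    }

    // Item 3: change an attribute on the first link and read it back.
    static String retargetLink() {
        Document doc = Jsoup.parse(HTML);
        Element link = doc.selectFirst("a");
        link.attr("href", "https://example.com/docs");
        return link.attr("href");
    }

    // Item 4: clean untrusted markup against a basic safe list.
    static String cleanUntrusted(String dirty) {
        return Jsoup.clean(dirty, Safelist.basic());
    }

    public static void main(String[] args) {
        System.out.println(introText());
        System.out.println(retargetLink());
        System.out.println(cleanUntrusted("<p>Hi<script>alert(1)</script></p>"));
        // Item 5: output the parsed body as tidy, indented HTML.
        System.out.println(Jsoup.parse(HTML).body().html());
    }
}
```

Safelist.basic() permits only simple text formatting tags and links, so the script element in the sample input is dropped from the cleaned output.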
Example:
package com.test.main;

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Test1 {
    public static void main(String[] args) throws IOException {
        // Fetch the page over HTTP and parse it into a DOM tree.
        Document doc = Jsoup.connect("https://jsoup.org/").get();
        // Select every <div> element using a CSS selector.
        Elements divs = doc.select("div");
        // Print the combined inner HTML of all matched elements.
        System.out.println(divs.html());
    }
}
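As a possible variation on the example above, the sketch below walks over the matched elements one by one instead of printing their combined HTML. The class name LinkLister, the helper absoluteLinks, and the user-agent string are hypothetical names chosen for illustration; the timeout value is likewise an arbitrary assumption.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkLister {

    // Collect the absolute URL of every link that has an href attribute.
    static List<String> absoluteLinks(Document doc) {
        List<String> urls = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            // absUrl resolves relative hrefs against the document's base URI.
            urls.add(a.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) throws IOException {
        Document doc = Jsoup.connect("https://jsoup.org/")
                .userAgent("Mozilla/5.0 (example)") // assumed value, for illustration
                .timeout(10_000)                    // milliseconds
                .get();
        for (String url : absoluteLinks(doc)) {
            System.out.println(url);
        }
    }
}
```

Keeping the extraction logic in a method that takes a Document means it can also be exercised offline, for instance with Jsoup.parse(html, baseUri) on a string.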
Please explain in more detail.