Implementation and Application of HTML Parser Based on Swing
HTML page parsing is the foundation of all subsequent work. After analyzing the tags and the categories of hyperlinks relevant to an HTML parser, an HTML parser was implemented on top of Java's Swing package to extract hyperlinks and anchor text from HTML documents. The parser is then applied to the development of the Spider component of a search engine for a multimedia information retrieval system. Starting from several seed websites and using an appropriate search algorithm, the Spider selects web pages containing audio, video, and Flash animations and stores them in a database.
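The extraction step described above can be sketched with the HTML parser that ships in Java's Swing package (`javax.swing.text.html`). This is a minimal illustration, not the implementation from the paper: the class name `LinkExtractor`, the `Link` holder, the `looksLikeMedia` helper, and its list of file extensions are all hypothetical and chosen only for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class: collects (href, anchor text) pairs from an HTML string.
public class LinkExtractor {

    // Simple value holder for one hyperlink and its anchor text.
    public static final class Link {
        public final String href;
        public final String text;

        Link(String href, String text) {
            this.href = href;
            this.text = text;
        }
    }

    // Assumed extensions for audio, video, and Flash resources (illustrative only).
    private static final String[] MEDIA_EXTENSIONS = {".mp3", ".wav", ".avi", ".mpg", ".swf"};

    public static List<Link> extract(String html) throws IOException {
        final List<Link> links = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private String currentHref;                       // non-null while inside <a href=...>
            private final StringBuilder anchorText = new StringBuilder();

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    currentHref = (href == null) ? null : href.toString();
                    anchorText.setLength(0);
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Accumulate text only while an anchor with an href is open.
                if (currentHref != null) {
                    anchorText.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.A && currentHref != null) {
                    links.add(new Link(currentHref, anchorText.toString().trim()));
                    currentHref = null;
                }
            }
        };

        // The third argument tells the parser to ignore charset declarations in the document.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return links;
    }

    // Crude extension check standing in for the Spider's media filter (an assumption).
    public static boolean looksLikeMedia(String href) {
        String lower = href.toLowerCase(Locale.ROOT);
        for (String ext : MEDIA_EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><a href=\"http://example.com/clip.swf\">Demo clip</a></body></html>";
        for (Link link : extract(html)) {
            System.out.println(link.href + " | " + link.text + " | media=" + looksLikeMedia(link.href));
        }
    }
}
```

A Spider built on this sketch would presumably fetch each seed page, pass its content to `extract`, queue the returned links according to the chosen search algorithm, and record those that pass the media check; those steps are outside what the example shows.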