A simple, higher-level interface for Go web scraping.
When scraping with Go, I find myself redefining tree traversal and other utility functions.
This package is a place to put some simple tools that build on top of the Go HTML parsing library (golang.org/x/net/html).
For the full interface, check out the godoc.
Scrape defines traversal functions like `Find` and `FindAll` while attempting to be generic. It also defines convenience functions such as `Attr` and `Text`. For example:
```go
// Parse the page
root, err := html.Parse(resp.Body)
if err != nil {
	// handle error
}
// Search for the title
title, ok := scrape.Find(root, scrape.ByTag(atom.Title))
if ok {
	// Print the title
	fmt.Println(scrape.Text(title))
}
```

A full example, scraping the Hacker News front page:

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/yhat/scrape"
	"golang.org/x/net/html"
	"golang.org/x/net/html/atom"
)

func main() {
	// request and parse the front page
	resp, err := http.Get("https://news.ycombinator.com/")
	if err != nil {
		panic(err)
	}
	root, err := html.Parse(resp.Body)
	if err != nil {
		panic(err)
	}
	// define a matcher
	matcher := func(n *html.Node) bool {
		// must check for nil values
		if n.DataAtom == atom.A && n.Parent != nil && n.Parent.Parent != nil {
			return scrape.Attr(n.Parent.Parent, "class") == "athing"
		}
		return false
	}
	// grab all articles and print them
	articles := scrape.FindAll(root, matcher)
	for i, article := range articles {
		fmt.Printf("%2d %s (%s)\n", i, scrape.Text(article), scrape.Attr(article, "href"))
	}
}
```
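For simpler selections you often don't need a hand-written matcher: the package also ships matcher constructors such as `ByTag`, `ById`, and `ByClass` (see the godoc). Below is a minimal sketch using `ByClass`, assuming it matches any node whose class attribute includes the given class; the `athing` class comes from the Hacker News markup used in the example above, and the rest is illustrative:

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/yhat/scrape"
	"golang.org/x/net/html"
)

func main() {
	resp, err := http.Get("https://news.ycombinator.com/")
	if err != nil {
		panic(err)
	}
	root, err := html.Parse(resp.Body)
	if err != nil {
		panic(err)
	}
	// ByClass builds a matcher for nodes carrying the "athing" class;
	// on Hacker News each story row has this class.
	rows := scrape.FindAll(root, scrape.ByClass("athing"))
	fmt.Printf("found %d story rows\n", len(rows))
	for i, row := range rows {
		// Text flattens all text beneath the node into one string
		fmt.Printf("%2d %s\n", i, scrape.Text(row))
	}
}
```

Because a matcher is just a `func(*html.Node) bool`, these constructors compose freely with hand-written closures like the one in the full example.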