Scraping

PyQuery can load an HTML document from a URL:

>>> pq(url=your_url)
[<html>]

By default, it uses Python's urllib.

If requests is installed, PyQuery uses it instead. This allows you to pass most of the requests parameters:

>>> pq(url=your_url, headers={'user-agent': 'pyquery'})
[<html>]

>>> pq(url=your_url, data={'q': 'foo'}, method='post', verify=True)
[<html>]

Timeout

The default timeout is 60 seconds. To change it, set the timeout parameter, which is forwarded to the underlying urllib or requests library.

Session

When using the requests library, you can instantiate a Session object, which keeps state between HTTP calls (for example, cookies). Set the session parameter to make PyQuery use this session object.