Generating Request URLs
In a REST API, the client or application program (the kind of program you will be writing) makes an HTTP request that includes information about what kind of request it is making. Web sites are free to define whatever format they want for the request. This chapter covers a particularly common and particularly simple format, where the request information is encoded right in the URL. This is convenient because, if something goes wrong, we can debug by copying the URL into a browser and seeing what happens when it tries to visit that URL.
In this format, the URL has a standard structure:
- the base URL
- a ? character
- one or more key-value pairs, formatted as key=value and separated by the & character
For example, consider the URL https://itunes.apple.com/search?term=Ann+Arbor&entity=podcast.
Try copying that URL into a browser. It returns data about podcasts posted from Ann Arbor, MI. Depending on your browser, it may save the response as a file attachment that you have to open to see the contents.
Let’s pull apart that URL.
- the base URL is https://itunes.apple.com/search
- a ? character
- key=value pairs. In this case, there are two pairs. The keys are term and entity.
All those parts are concatenated together to form the full URL.
Note that in the search term Ann Arbor, the space had to be “encoded” as +. More on that below.
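The breakdown above can be checked with Python's standard urllib.parse module. The following is a minimal sketch, not part of the original example: it splits the iTunes URL into its base and its key=value pairs, then concatenates the pieces back together by hand. The variable names are illustrative only.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://itunes.apple.com/search?term=Ann+Arbor&entity=podcast"

# Split the URL into its components (scheme, host, path, query string)
parts = urlsplit(url)
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
query_string = parts.query

# The raw key=value pairs, still encoded, separated by &
pairs = query_string.split("&")

# parse_qs decodes the values (+ becomes a space) and maps each key to a list
decoded = parse_qs(query_string)

# Concatenate the base URL, a ?, and the &-separated pairs to rebuild the URL
rebuilt = base_url + "?" + "&".join(pairs)

print(base_url)
print(pairs)
print(decoded)
print(rebuilt == url)
```

Note that parse_qs returns a list for each key, because a key is allowed to appear more than once in a query string.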
Encoding URL Parameters
Here’s another URL that has a similar format: https://www.google.com/search?q=%22violins+and+guitars%22&tbm=isch. It’s a search on Google for images that match the string “violins and guitars”. It’s not actually based on a REST API, because the contents that come back are meant to be displayed in a browser. But the URL has the same structure we have been exploring above and introduces the idea of “encoding” URL parameters.
- The base URL is https://www.google.com/search
- ?
- Two key=value parameters, separated by &
  - q=%22violins+and+guitars%22 says that the query to search for is “violins and guitars”.
  - tbm=isch says to go to the tab for image search.
Now why is "violins and guitars" represented in the URL as %22violins+and+guitars%22? The answer is that some characters are not safe to include, as is, in URLs. For example, a parameter value in a URL should not contain a literal double-quote character. Characters such as : and / have special meaning in a URL, and a space is not allowed at all, so they need the same treatment when they appear in a parameter value. Whenever we want to include one of those characters, we have to encode it with other characters. A space is encoded as +. " is encoded as %22. : would be encoded as %3A. And so on.
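To see these substitutions directly, the standard library function urllib.parse.quote_plus applies this style of encoding to a single string. This is a small illustrative sketch, separate from the requests module discussed in this chapter.

```python
from urllib.parse import quote_plus, unquote_plus

# The double quotes are part of the search phrase
phrase = '"violins and guitars"'

# quote_plus replaces spaces with + and other unsafe characters with %XX codes
encoded = quote_plus(phrase)
print(encoded)

# Encode a few individual characters to see their substitutions
for ch in ['"', ':', '/', ' ']:
    print(repr(ch), '->', quote_plus(ch))

# unquote_plus reverses the encoding
decoded = unquote_plus(encoded)
print(decoded == phrase)
```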
Using requests.get to encode URL parameters
Fortunately, when you want to pass information as a URL parameter value, you don’t have to remember all the substitutions that are required to encode special characters. Instead, that capability is built into the requests module.
The get function in the requests module takes an optional parameter called params. If a value is specified for that parameter, it should be a dictionary. The keys and values in that dictionary are used to append something to the URL that is requested from the remote site.
For example, in the following, the base URL is https://google.com/search. A dictionary with two parameters is passed. Thus, the whole URL is that base URL, plus a question mark, “?”, plus a “q=…” and a “tbm=…” separated by an “&”. In other words, the URL that is requested is https://google.com/search?q=%22violins+and+guitars%22&tbm=isch. If the site redirects the request, for example to www.google.com, the printed URL may show the address after the redirect. Since Python 3.7, dictionaries preserve insertion order, so the pairs appear in the order they were added to the dictionary. In older versions of Python, the final URL might sometimes have the encoded key-value pairs in the other order: https://google.com/search?tbm=isch&q=%22violins+and+guitars%22. Fortunately, most websites that accept URL parameters in this form will accept the key-value pairs in any order.
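import requests

# The double quotes are part of the search string; they get encoded as %22
d = {'q': '"violins and guitars"', 'tbm': 'isch'}
results = requests.get("https://google.com/search", params=d)
print(results.url)  # the full URL, including the encoded parameters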
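The same query string can be previewed without making a network request. The sketch below uses urllib.parse.urlencode from the standard library as a stand-in for the encoding step. For a simple dictionary like this one, the result should match what requests appends to the base URL, though that equivalence is an assumption here rather than something the example above demonstrates.

```python
from urllib.parse import urlencode

base_url = "https://google.com/search"

# Same dictionary as above; the double quotes are part of the search string
d = {'q': '"violins and guitars"', 'tbm': 'isch'}

# urlencode turns the dictionary into key=value pairs joined by &,
# encoding special characters in the keys and values
query_string = urlencode(d)

# Base URL, then ?, then the encoded pairs
full_url = base_url + "?" + query_string
print(full_url)
```

Pasting the printed URL into a browser is a quick way to check that the parameters are what you intended.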
Below are more examples of URLs, outlining the base part of the URL, which would be the first argument when calling requests.get(), and the parameters, which would be written as a dictionary and passed into the params argument when calling requests.get().
Note
If you’re ever unsure exactly what URL has been produced when calling requests.get and passing a value for params, you can access the .url attribute of the object that is returned. This will be a helpful debugging strategy. You can take that URL and plug it into a browser and see what results come back! We will talk about this more in the next section, on debugging calls to requests.get() when they don’t do exactly what you expect.
Check your understanding
How would you request the URL http://bar.com/goodstuff?greet=hi+there&frosted=no using the requests module?
- requests.get("http://bar.com/goodstuff", '?', {'greet': 'hi there'}, '&', {'frosted':'no'})
  - The ? and the & are added automatically.
- requests.get("http://bar.com/", params = {'goodstuff':'?', 'greet':'hi there', 'frosted':'no'})
  - goodstuff is part of the base URL, not the query params.
- requests.get("http://bar.com/goodstuff", params = ['greet', 'hi', 'there', 'frosted', 'no'])
  - The value of params should be a dictionary, not a list.
- requests.get("http://bar.com/goodstuff", params = {'greet': 'hi there', 'frosted':'no'})
  - The ? and & are added automatically, and the space in hi there is automatically encoded as +.