When it comes to parsing the content of an external web page, cURL is a sweet tool for the job.
A simple request from the command line:
$ curl http://www.chrisrolle.com/blog
returns some HTML:
<!DOCTYPE html>
<html lang='en'>
<head>
<title>Christian Rolle</title>
<!--
some HTML
-->
</html>
But cURL can do so much more.
cURL request header
The request header can be set with option -H:
$ curl -H "Accept: application/json" \
http://www.chrisrolle.com/blog
If the API can respond with the requested JSON, the result could look like:
{
"articles":[
{ "title":"ActiveRecord equality - explained" },
{ "title":"Benchmark vs. question mark" }
]
}
For some standard HTTP header fields such as User-Agent, Cookie and Referer, there is also a dedicated option to set them:
- -A (or --user-agent): for the “User-Agent” field
- -b (or --cookie): for the “Cookie” field
- -e (or --referer): for the “Referer” field
Manipulating the User-Agent:
curl -A "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5" http://www.chrisrolle.com
or the referer URL:
curl --referer http://google.com http://www.chrisrolle.com
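and even the cookie (given as a name=value pair, because a value without "=" is treated as the name of a cookie file):
curl --cookie "_session=Sw33t-C00k1e-5ess10n-1D" http://www.chrisrolle.com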
cURL with parameters
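Sometimes additional parameters have to be sent. Option -d passes them, and -G (or --get) appends them to the URL as a query string instead of sending them in a request body:
$ curl -G -H "Accept: application/json" -d "search=question" http://www.chrisrolle.com/blog
and the filtered response: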
{
"articles":[
{ "title":"Benchmark vs. question mark" }
]
}
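Parameter values containing spaces or other reserved characters have to be percent-encoded first. A minimal sketch, assuming the blog accepts the same search parameter as above (the search term itself is made up): --data-urlencode encodes the value, and -G (or --get) puts the data into the query string.

```shell
#!/bin/sh
# Hypothetical search term containing spaces and an ampersand.
term="question & answer"
url="http://www.chrisrolle.com/blog"

# -G (--get) appends the data to the URL as a query string.
# --data-urlencode percent-encodes everything after the first "=".
# --max-time keeps the call from hanging if the host is unreachable.
response=$(curl -s -G --max-time 10 \
  -H "Accept: application/json" \
  --data-urlencode "search=${term}" \
  "$url") || echo "request failed" >&2

printf '%s\n' "$response"
```

Without the encoding, the ampersand would be read as a separator between two parameters.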
cURL response header
The response header can be inspected with option -I (or --head), which sends a HEAD request:
$ curl --head http://www.chrisrolle.com
HTTP/1.1 200 OK
Server: Cowboy
Date: Sun, 5 Jun 2016 10:44:35 GMT
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=0; includeSubDomains
Content-Type: text/html; charset=utf-8
Etag: W/"f0f8147b4d242b90067f4bfd2873b836"
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: request_method=HEAD; path=/
Set-Cookie: _session=xxx; path=/; HttpOnly
X-Request-Id: e8fa64a4-36c8-9111-8322-126a0d236bc9
X-Runtime: 0.163079
Via: 1.1 vegur
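In scripts often only the status code matters. A rough sketch of how that could be checked, assuming a plain GET request is acceptable; the helper is_success is a made-up name for illustration, not part of cURL:

```shell
#!/bin/sh
url="http://www.chrisrolle.com"

# Hypothetical helper: succeeds only for 2xx status codes.
is_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# -s silences the progress meter, -o /dev/null discards the body,
# -w '%{http_code}' prints only the numeric status code.
# Fall back to 000 when curl fails before any response arrives.
status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || status="000"

if is_success "$status"; then
  echo "OK: $status"
else
  echo "Unexpected status: $status"
fi
```

Redirects (3xx) count as unexpected here; adding -L would make cURL follow them before the code is reported.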
cURL cookie session
For session handling it can be useful to store the cookies in a file:
curl --cookie-jar cookie.txt http://www.chrisrolle.com
for later usage:
curl --cookie cookie.txt http://www.chrisrolle.com
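Both options can also be combined, so that one file is read and updated across several requests. A small sketch, assuming a temporary file is fine as cookie jar:

```shell
#!/bin/sh
url="http://www.chrisrolle.com"
jar=$(mktemp)   # temporary cookie jar

# First request: -c (--cookie-jar) writes the received cookies to the jar.
curl -s -o /dev/null --max-time 10 -c "$jar" "$url" \
  || echo "first request failed" >&2

# Later requests: -b (--cookie) sends the stored cookies, -c keeps the jar up to date.
curl -s -o /dev/null --max-time 10 -b "$jar" -c "$jar" "$url" \
  || echo "second request failed" >&2

# Clean up the jar once the session is no longer needed.
rm -f "$jar"
```

The jar is a plain text file, so it can be inspected with any editor before it is removed.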
cURL form substitute
A form submit can be faked as well. POST is the standard HTTP verb for forms. Option -d already makes cURL send a POST request, so -X POST is optional here. The appropriate parameters have to be sent along likewise:
curl -X POST -d 'id=123&person[name]=Christian' http://example.com/form
For multipart forms there is the option -F (or --form):
curl --form upload=@people.csv \
--form press=submit http://example.com/form
More options are described in the manpage.