Http is stateless.
Each http request is independent of all others.
The server doesn't view its interactions as a bunch of phone calls;
rather, each http request is a postcard.
There is no inherent way of knowing, when reading a postcard,
what previous postcards it may be referring to.
One solution is that the person sending the postcard (the client)
includes a “re-cap”,
reminding the recipient (the server) who they are,
as well as anything else that they've already agreed upon.
That's a cookie!
Of course, the recipient needs to be wary of whatever the
sender claims was previously agreed upon.
We'll (help) address that issue with sessions, next lecture.
In particular, we'll use sessions that are based on cookies.
Summary: Cookies.
Concepts:
A way of communicating state, between two different http requests/visits.
The server wants to remember something for a future visit, so it
asks the client to remember it for them, and to provide that info
on all future requests.[1]
This is a cooperation between client and server.
(In particular, client can choose to ignore requests to cooperate.)
Technicalities:
server code calls setcookie,
to ask a client to re-provide that information on their future visits.
A cookie is a name/value pair, along with a few other pieces of
information (domain, path, expiration date, and some others).
It's up to the client (browser) to include those name/value pairs on future visits.
server code (php) can look up in the array $_COOKIE,
to find any values the client included with the current page-request.
To delete a cookie, re-set it with an expiration-date that is in the past.
Pragmatics/Gotchas:
Be sure to call setcookie before printing anything.
When you setcookie, those cookies won't be sent back to
you until the next visit (they're not instantly inserted into $_COOKIE).
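To make the two gotchas concrete, here is a minimal sketch of a visit counter. The cookie name 'visits' and the helper next_visit_count are assumptions made up for this illustration; the point is only that the count is read from what the client sent, and the new value is handed back with setcookie before any output.

```php
<?php
// Hypothetical helper: compute the next visit count from the cookies the
// client sent. The cookie name 'visits' is an assumption for this sketch.
function next_visit_count(array $cookies): int
{
    $raw = $cookies['visits'] ?? '0';
    // The client controls this value, so accept only short digit strings.
    $ok = is_string($raw) && ctype_digit($raw) && strlen($raw) <= 9;
    $previous = $ok ? (int) $raw : 0;
    return $previous + 1;
}

$count = next_visit_count($_COOKIE);

// setcookie() only adds a Set-Cookie header to the response, so it must run
// before any output. It does not change $_COOKIE for the current request.
if (!headers_sent()) {
    setcookie('visits', (string) $count, time() + 3600);
}

echo "This is visit number $count.\n";
```

Note that $_COOKIE['visits'] still holds the old value (or nothing) for the rest of this request; the browser sends the new value on its next visit, if it chooses to cooperate.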
Cookie Details
Cookies
(Note: we will not use cookies directly in this class;
we'll use sessions. But sessions are built on top of cookies,
so knowing how cookies work is a requirement for understanding sessions.)
Cookies are a method for the server to store information about the user --
on the user's machine -- so that the server can remember
the user over the course of the visit or through several
web visits.
A cookie is just a key=>value entry
(just like a php array entry, or javascript object property,
or a java.util.Map entry, or json line).
It lives on the client machine,
and is provided by the browser when a request is made.
A PHP script can access
cookies by looking in the super-global $_COOKIE.
Example:
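If a php script tells a browser to
setcookie('hamburger-price', '2.70', time() + 14*24*3600) today,
and then the browser visits that same site next Tuesday,
the php can evaluate $_COOKIE['hamburger-price']
and get back the string '2.70' (cookie values always come back as strings,
and a cookie set without an expiration time would be gone once the browser is closed).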
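A small sketch of reading that value back. The helper price_from_cookie is hypothetical; it shows that whatever was stored arrives as a string, and that the script should check it before treating it as a number, since the client could have sent anything.

```php
<?php
// Hypothetical helper: read the price back out of the cookie array.
function price_from_cookie(array $cookies): ?float
{
    $raw = $cookies['hamburger-price'] ?? null;
    // Cookie values arrive as strings chosen by the client, so validate.
    if (!is_string($raw) || !is_numeric($raw)) {
        return null;
    }
    return (float) $raw;
}

$price = price_from_cookie($_COOKIE);
echo $price === null ? "No usable price cookie was sent.\n" : "Price: $price\n";
```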
In addition to the name/value pair, a cookie also has:
an expiration date, a domain, and a directory path.
Whenever a browser requests a page, it attaches every cookie key/value pair
whose domain and path match that request, and sends those to the server.
So if you visit a page that performs a
setcookie('hamburger-price', '2.70', 0, '/~ibarland', 'php.radford.edu'),
and later you visit
https://php.radford.edu/~ibarland/someDir/somefile.php,
your browser will attach the hamburger-price cookie.
But if you visit
https://php.radford.edu/~jcdavis/someDir/somefile.php,
the browser won't attach the cookie.
Similarly if you visit
https://rucs.radford.edu/~ibarland/someDir/somefile.php,
the browser won't attach the cookie either.
(Note that a cookie set for the domain '.radford.edu' (note the initial .)
is sent to all machines within the radford.edu domain.)
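The matching rule above can be sketched as code. This is a simplified approximation of what a browser does, not the full algorithm (RFC 6265 covers more cases, such as expiry and the secure flag); the function names domain_matches, path_matches and cookie_is_sent are made up for this sketch.

```php
<?php
// Simplified sketch of the browser's decision whether to attach a cookie.

function domain_matches(string $cookieDomain, string $host): bool
{
    $cookieDomain = strtolower(ltrim($cookieDomain, '.'));
    $host = strtolower($host);
    if ($host === $cookieDomain) {
        return true;
    }
    // Otherwise the host must end with ".<cookie domain>".
    $suffix = '.' . $cookieDomain;
    return strlen($host) > strlen($suffix)
        && substr($host, -strlen($suffix)) === $suffix;
}

function path_matches(string $cookiePath, string $requestPath): bool
{
    if ($cookiePath === $requestPath) {
        return true;
    }
    $len = strlen($cookiePath);
    if ($len === 0 || strncmp($requestPath, $cookiePath, $len) !== 0) {
        return false;
    }
    // The prefix must end at a '/' boundary: '/~ib' must not match '/~ibarland'.
    return $cookiePath[$len - 1] === '/' || $requestPath[$len] === '/';
}

function cookie_is_sent(string $cookieDomain, string $cookiePath, string $url): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/';
    }
    return is_string($host)
        && domain_matches($cookieDomain, $host)
        && path_matches($cookiePath, $path);
}

$urls = [
    'https://php.radford.edu/~ibarland/someDir/somefile.php',
    'https://php.radford.edu/~jcdavis/someDir/somefile.php',
    'https://rucs.radford.edu/~ibarland/someDir/somefile.php',
];
foreach ($urls as $url) {
    $sent = cookie_is_sent('php.radford.edu', '/~ibarland', $url);
    echo $url, ' => ', $sent ? 'cookie attached' : 'no cookie', "\n";
}
```

Only the first URL gets the cookie. Changing the cookie's domain to '.radford.edu' would make the third URL match as well.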
Don't make two different cookies with the same name and path, but different domains
(one a superset of the other).
Different browsers may choose differently which one gets sent.
(AFAICT: the more specific path wins; but for same paths with
two applicable domains, the cookie that was set first wins.)
It's not exactly advisable to make two different cookies with
the same name but different paths either, though that may not be enforceable,
e.g. /~ibarland and /~jcdavis may each contain
different scripts that happen to use the cookie “monster”.
Note also that if I set a cookie's path to be /,
then this is potentially a security flaw:
if somebody visits my script and I set a cookie secret-code-word
with server&path being radford.edu & /,
and then that person visits (say) radford.edu/~jcdavis,
his script will be sent the cookie that I had saved.
(He's a sly one, that jcdavis!)
Upshot: don't set the cookie's server&path to include URLs that others control.
You can look at the cookies on your machine itself:
e.g. Chrome > Preferences… → Show advanced settings… → Privacy →
Content Settings… → All cookies and site data…,
and then use the search-box for (say) radford or amazon.
The full parameter-list lets you set all the info seen above:
setcookie(name, value, expiration, path, domain, secure, httponly);
name — (required) cookie name
value — (optional in php's signature, defaulting to the empty string) a string; browsers typically limit a cookie to about 4KB of data
expiration — (optional) used to
set the expiration-date for the cookie, as a unix timestamp.[2]
Often, one takes the current-time plus some amount:
setcookie('my-c-name','some-value',time() + 3600);
Setting the expiration-time as 0 is a special sentinel,
which means to expire when the user closes their browser.
path and domain (optional) parameters are used to limit a cookie
to a specific folder in a Web site (the path) or to
a specific domain, so this might be used to limit a
cookie to a subdomain, such as learn.radford.edu.
Using the path option, you could limit a cookie to exist only
while a user is in, say, the user/jellybeans folder of
the domain: setcookie('name','value',time() + 3600, '/user/jellybeans').
Now, that cookie won't be shared even w/ most other scripts on the same host
(provided the browser elects to use cookies as-intended).
secure (optional) dictates that a cookie should
only be sent over a secure HTTPS connection. A value
of true indicates that a secure connection must be used,
whereas false indicates that a secure connection isn't
required. setcookie(name,value,time()+3600,'','',true);
By default, you should use this option unless
you have a specific reason not to.
httponly (optional) — can be used to restrict access to
the cookie (for example, preventing a cookie from being
read using javascript). All current browsers honor it.
By default, you should use this option unless
you have a specific reason not to.
This last flag reminds us: if the browser has a bug where
it might give out cookies to other sites (or an attacker
can gain other access to the folder where cookies are stored),
then there is a privacy vulnerability.
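Putting the optional parameters together: assuming PHP 7.3 or later, setcookie also accepts the optional settings as a single array, which reads more clearly than a long positional list. The helper cookie_options below is hypothetical (its $now parameter exists only so the result is easy to check); it turns on secure and httponly by default, as recommended above.

```php
<?php
// Hypothetical helper that gathers the optional cookie settings in one place.
function cookie_options(int $lifetimeSeconds, string $path, string $domain = '', ?int $now = null): array
{
    $now = $now ?? time();
    return [
        // 0 is the sentinel for "expire when the browser closes".
        'expires'  => $lifetimeSeconds === 0 ? 0 : $now + $lifetimeSeconds,
        'path'     => $path,
        'domain'   => $domain,
        'secure'   => true,   // only send over https
        'httponly' => true,   // hide from javascript
    ];
}

// Must run before any output, since it adds a header to the response.
if (!headers_sent()) {
    setcookie('my-c-name', 'some-value', cookie_options(3600, '/user/jellybeans'));
}
```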
It is essential to remember:
Cookies are created by the server, but stored on the client machine.
This explains why you have to call setcookie (rather than
just assign $_COOKIE['hamburger-price'] = ...):
You actually want to send cookies to the browser to remember (which is what
setcookie(…) does, and what assigning
to an array can't do).
It also explains why:
You must call setcookie before sending any other HTML —
the set-cookie info is sent to the browser as part of the http header, which
must be sent before any of the HTML data.
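To see why this is header data, it can help to look at roughly what travels over the wire. The function below is a simplified sketch that builds a Set-Cookie header line by hand; it is not what php emits byte-for-byte (php's real header differs in small details, for example it also adds a Max-Age attribute), and the name set_cookie_header is made up for this illustration.

```php
<?php
// Simplified sketch of the header line that a setcookie() call turns into.
function set_cookie_header(string $name, string $value, int $expires = 0, string $path = ''): string
{
    $line = 'Set-Cookie: ' . $name . '=' . rawurlencode($value);
    if ($expires !== 0) {
        // The unix timestamp travels as a formatted GMT date string.
        $line .= '; Expires=' . gmdate('D, d M Y H:i:s', $expires) . ' GMT';
    }
    if ($path !== '') {
        $line .= '; Path=' . $path;
    }
    return $line;
}

echo set_cookie_header('hamburger-price', '2.70', 0, '/~ibarland'), "\n";
```

Because this line belongs with the other response headers, it has to be queued before the first byte of the page body is sent.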
Deleting a Cookie
Delete an existing cookie by sending a blank cookie whose expiration time is in the past
(using the same path and domain the cookie was originally set with):
setcookie('name','',time()-600);
Better yet,
to try to delete cookies even if the client's clock is wildly off
by hours or months or years:
set the timeout to 1 (one second after the epoch-start).
You might think of setting it to 0, but remember that that value is
used as a sentinel, meaning end-of-browser-session expiration.
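The deletion advice can be wrapped up in a small helper. This is a sketch; the name delete_cookie is hypothetical, and it assumes the caller passes the same path and domain that were used when the cookie was set.

```php
<?php
// Hypothetical helper: ask the browser to forget a cookie.
function delete_cookie(string $name, string $path = '', string $domain = ''): void
{
    if (!headers_sent()) {
        // Expire at timestamp 1, not 0: 0 would mean "until the browser closes".
        setcookie($name, '', 1, $path, $domain);
    }
    // The browser only learns about the deletion from the response, so also
    // drop our own copy if the rest of this request should not see it.
    unset($_COOKIE[$name]);
}

delete_cookie('name');
```

As with any cookie request, the browser is free to ignore this, so the server should not rely on the cookie really being gone.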
There is a knee-jerk reaction that cookies are bad,
because they somehow spy on you,
magically telling hackers and the NSA every site you visit.
While there are valid concerns, it's a bit confusing as to how this might
be achieved.
Remember: a cookie can only be used to store information that
the server gives you,
and that your browser chooses to keep,
and later sends (back?) to a server.
Limiting third-party cookies,
using secure (https-only) cookies,
and
disallowing javascript access (httponly) to cookies
are all wise ways of limiting access.
(And, they should be the defaults unless the programmer
requests otherwise, but that’s not the case.)
There are reasonable examples where
developers might want third-party cookies:
For example, kongregate.com is a common portal for javascript games,
but individual developers (who have their games on several different platforms,
and have their own game-state-servers)
want to let a visit to kongregate.com also share info with their
servers.
...
To ponder: How do other solutions (like oauth and kerberos) compare?
Example:
Remember, (hosted) images are often stored on a different server than
the page's text/html data.
Cookies can be set on any http request, including retrieving images!
When you request cia.gov, the response includes the
html “<img src='http://lotofbanners.com/qwerty.jpg' … />”.
Sure enough, your browser makes a request to lotofbanners.com, which responds
with the requested jpeg, and also has your browser set the
cookie victimID = 47 for the domain lotofbanners.com.
Later, you request mediawiki.org, whose response also includes
the html for the same banner.
Sure enough, your browser makes a request to lotofbanners.com — but this time
your browser is passing along the cookie victimID = 47 —
and now that site knows that whoever victimID=47 is, that person has seen their ad twice.
(In this example, we're presuming that lotofbanners.com is storing the exact ads seen
by each victimID.)
This doesn't seem too bad — as written, lotofbanners.com doesn't actually
know who you are,
just that the same person viewing the current banner has previously seen certain other banners.
But this can be leveraged:
If they name their banners “qwerty-for-cia.jpeg” and
“qwerty-for-mediawiki.jpeg”
and so on, then
they can now know,
out of all the sites they give banners for,
which of those sites you've visited (and when).
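The bookkeeping on lotofbanners.com's side can be sketched in a few lines. Everything here (serve_banner, $viewsById, $nextId) is hypothetical and only simulates two requests in memory; a real ad server would read $_COOKIE, call setcookie('victimID', …), and store the views in a database.

```php
<?php
// Sketch of what the banner script might do on each image request.
function serve_banner(array &$viewsById, int &$nextId, array $cookies, string $banner): string
{
    if (isset($cookies['victimID']) && is_string($cookies['victimID'])) {
        $id = $cookies['victimID'];      // a returning browser
    } else {
        $id = (string) $nextId;          // first sighting: hand out a new ID
        $nextId++;
    }
    $viewsById[$id][] = $banner;         // remember which banner was fetched
    return $id;                          // the value to send back as the cookie
}

$views = [];
$nextId = 47;

// First request: the browser has no cookie for lotofbanners.com yet.
$id = serve_banner($views, $nextId, [], 'qwerty-for-cia.jpeg');
// Later request from another site: the browser sends the cookie back.
serve_banner($views, $nextId, ['victimID' => $id], 'qwerty-for-mediawiki.jpeg');

print_r($views);
```

After these two requests, the entry for ID 47 lists both banner names, which is exactly the list of sites that browser has visited.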
Note that separately, just knowing a large chunk of browser history can be surprisingly specific,
when you include
specific-amazon-products-looked-at,
which takeout-restaurant-phone-numbers you're looking up,
what political-candidate-webpages you're viewing, and
what medical-info-pages you look at
—
from this
it is a not-unreasonable step that one could conceivably
narrow down, with decent confidence,
somebody's neighborhood, diseases, how they vote, and what their favorite pizza topping is.
BUT, it would require a single company to be hosting banners/ads for lots and lots of different
companies,
so perhaps this isn't too big a worry?
Well, one last thought:
huge numbers of websites outsource to google-analytics, to get info about usage.
These google-third-party cookies can be combined with the exact google searches you make
and your gmail contents,
which can give that company a vast trove of highly specific information.
It's a good thing they use their power for good only!
(… until NSA gives a court-order,
or just plain steals the data from wiretaps placed on intercontinental data trunks,
or a hacker-or-disgruntled-employee gets access to their database, …).
[1] Kinda like emails that include/repeat/quote
the entire preceding thread.
[2] Although php's setcookie is given a timestamp,
the representation actually in the http packet can be
a formatted date string.
So there isn't any Y2K38 problem in http.