Behind The Login Screen: Understanding Web Authentication Protocols

This guide, by Marlena Erdos, was originally presented as supporting materials for her presentation to the abcd-security subgroup in October 2014. Unless otherwise noted (or obviously gathered from elsewhere, such as screenshots), all material is by Marlena Erdos.

Introduction

When you access a web-based application, sometimes you see a login screen and sometimes not. What's happening behind the scenes for "login"? And what's happening in between the times when you are asked to supply your username and password?



 
The material below (originally created in support of a talk, and hence not entirely complete) focuses on helping you understand what a "protocol" is, and how all web authentication protocols (CAS, SAML/Shibboleth, and Harvard's own PIN protocol) use the underlying HTTP protocol to authenticate a user to a service securely. It also covers what happens in between the times those protocols are in action.

The guide is mostly text — not pictures. But a powerful analogy to an everyday life activity will help you understand web authentication protocols in a fairly deep way even if you aren't particularly technical. This guide addresses three questions:

  • Why do you sometimes see a login screen and sometimes not? (Quick answer: Your "authentication session state.")
  • When you do see a login screen, what's happening behind the scenes? (Quick answer: A "web authentication protocol.")
  • When you aren't asked to log in, why? (Quick answer: Your session state with your app or its authentication system is "good.")

By the end of this guide, you should better understand ...

  •  What a "protocol" is.
  •  How "sessions" are used in "ongoing authentication."
  • How all web authentication protocols — CAS, SAML/Shibboleth, OAuth, and Harvard's own PIN protocol — use the underlying HTTP protocol to accomplish authenticating a user to a web application securely.
  • Some commonalities among these authentication protocols.

Please note that this guide isn't a side-by-side comparison of all the protocols. It's also not a guide to the mechanisms used to secure messages — digital signatures and bulk encryption — though here is a quick description of these:

  • Digital signature: A way for the recipient to know who sent a message, as well as tamper-evident packaging for the message.
  • Encrypted channel or bulk encryption: A way to prevent eavesdroppers from getting any meaningful info. (HTTPS does bulk encryption.)

Why Bother With Web Authentication Protocols (and Systems)?

Historical Evolution of Apps, Authentication Repositories, and Authentication Protocols

In the early days of Web applications, each app had its own name/password database and used a homegrown protocol for authentication.


  • Drawback: Each user needed separate name/password combinations at many different sites — which is a management headache for the user.
  • Benefit: Compromise of a given user's account on one app didn't necessarily affect the user's accounts on other apps.

As technology and standards progressed, an app in an organization might use an LDAP directory (and its associated protocol) for name/password storage and validation – with the same LDAP used by other apps in the organization.

  • Benefit: Each user had only one name/password combination.
  • Drawbacks: Each app has its hand on user passwords! Not cool – and not secure.
    • A malicious party ("Mal") who subverts a single app thereby compromises the passwords of many people.

    
The next wave was the use of third-party authentication systems and associated protocols (often called single sign-on, or SSO, systems) designed to manage usernames and passwords and authenticate users on behalf of an app. Both the users and the participating apps trust the SSO system, and the SSO system helps the apps trust the user. (Some SSO systems, such as Kerberos, help the user trust the app too.) In SSO settings, authentication protocols consist of the messages an app can use to request authentication of a user, and the response messages from the authentication system. A "Web authentication protocol" uses the features of HTTP, itself a protocol, to accomplish the authentication task.

    "Third-party authentication system" means that something other than the app is managing the name and passwords for users. Or, more technically, that the third party is managing authentication between you (the first party) and the app (the second party).

    At Harvard, PIN, CAS, SAML/Shibboleth, and OAuth are examples of protocols for third-party authentication systems. They use HTTP, which makes them Web authentication protocols. (Kerberos is also a third-party authentication system, but is not Web-based because it doesn't use HTTP for its protocol — at least it didn't in its original form, which predated HTTP.)

    What's a Protocol?

    In computer communications, a protocol is a set of messages (often exchanged in a defined order) together with their formats and, typically, some defined content.

    A Real-Life Example: The Post Office

    Messages and operations — in other words, requests — you might make at the post office include:

    • Please send normal mail.
    • Please forward my mail to my new address.
    • Please mail this with a return receipt.

    There are return messages — in protocol lingo, responses — that depend on what you supplied in your request and inherently convey a status:

    • "OK, you're all set."
    • "Hey, no stamp!"
    •  "You didn't fill this receipt out correctly, but come to the front of the line when you do." 

    Obviously, the postal worker is looking at the format of your envelope (and return receipt, etc). And format correctness affects the response (and status) you get.

    You, the user, take different actions depending on the status in the response. 

    These concepts — format correctness and response status — apply to both web authentication protocols and  HTTP.

    Useful Protocol Terms

    • Body (or "message body"): In post-office terms, what's inside the envelope.
    • Header (or "message header"): In post-office terms, what's on the envelope; for electronic requests and responses, fields (such as addressing info) that precede the message "body," such as domain, message length, language used, etc.

    Protocols Within Protocols

    Let's look at the snail-mail metaphor a bit further. The post office doesn't know anything at all about the contents of your envelope; you could be, say, ordering a part for your antique tractor from a business that isn't online. That's your "application-level" request. The owner of the store — let's call her Jan — might send you back an envelope containing a response to your message, that says "the part is in stock; send me your credit card number," or maybe "the part you want is backordered."

    She might also send you an order number that you can use in your next message, rather than filling out an order form again. In the world of protocols, that order number is an index into the session state (what you ordered, your address, and so on) that the store has in its records.

    And finally, both you and Jan are also separately engaging in a protocol with the post office. Your interaction with each other is layered over (or, rather, within) your requests and responses with the post office. What this means is that if you went to the postal clerk and asked him or her to send you a refurbished John Deere gasket, the clerk would say, "Eh?" — i.e. the carrying protocol doesn't know anything about the protocol inside the message body.

    This idea — of a protocol occurring inside of another protocol — is a key concept in the Web, and in authentication protocols.

    Most if not all web authentication protocols — such as CAS, Shibboleth/SAML, OAuth, and Harvard's own PIN — are "layered over" HTTP.

    And HTTP itself is layered over TCP, which is layered over IP, which is layered over a link-level protocol, which is layered over a physical medium protocol.

    Like nested Russian dolls, one protocol fits inside another, which fits inside another and so on.
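
    The nesting can be sketched in a few lines of Python. This is only an illustration: the wrap helper, the text labels, and the sample message are made up for this example, and real protocols add their own headers rather than bracketed labels.

```python
# Illustrative only: each layer treats whatever it carries as an opaque payload.
def wrap(layer: str, payload: str) -> str:
    """Put a payload inside an 'envelope' labeled with the carrying layer."""
    return f"[{layer} header]{payload}[/{layer}]"


# Hypothetical application-level message (for example, an authentication request).
message = "please authenticate this user"

# Wrap from the innermost carrier outward: HTTP, then TCP, then IP, then link.
packet = message
for layer in ["HTTP", "TCP", "IP", "link"]:
    packet = wrap(layer, packet)

print(packet)
```

    Each call to wrap only adds its own envelope and never looks inside the payload, which is the point of the postal clerk's "Eh?" above.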

    "Seven Layer Model" and the Five Actual Protocol Layers

    The historical OSI network protocol model consists of seven nested layers. These are Application, Presentation, Session, Transport, Network, Data Link, and Physical.

    In today's Internet, the protocols actually in use are the application itself, HTTP, TCP, IP, a link protocol (Ethernet, wireless), and the physical medium. The application and HTTP share a layer, which gives five layers in all.

    These correspond to the following layers in the OSI model: Application (for the actual application and HTTP), Transport (for TCP), Network (for IP), Data Link (ethernet, wireless), and Physical.

    Web authentication protocols are "application" protocols that ride over the HTTP application protocol — which uses TCP, then IP, and then a link and physical layer.

    Web Authentication Protocols use HTTP features – in particular, "cookies" and "redirects" – to accomplish their work.

    Upshot: We need to understand HTTP to understand Web Authentication Protocols!

    HTTP Protocol In Some Depth

    • Before HTTP, there were protocols such as Telnet and FTP layered over TCP/IP, but no HTML and no hypertext, where you can click on a link and be taken to a new page.
    • HTTP (HyperText Transfer Protocol) is the fundamental protocol of the web.
    • Every page you see (including this one!) depends on the HTTP protocol flowing between your browser and an application fronted by a web server (or an application doing the protocol itself, though that is less common).

    HTTP (HyperText Transfer Protocol) is a strict request/response protocol. There's no response (from a web server) without a request from a browser (or other entity doing the request side of HTTP).  (Web authentication protocols are also request/response, but aren't so much in lockstep. We'll get to this later.)

    By (imperfect, but adequate) analogy, the postal worker won't just give something to you (say, stamps) while you're standing in line without a prior request from you — and neither will an HTTP (web) application.

    This is entirely contrary to an end user's intuition — after all, you were asked to log in, right? That's a request, right? Yes, from the application-level standpoint. But from the HTTP perspective, it's a response. 

    HTTP Request: What Your Browser Sends to the Application

    The HTTP request header always has a domain (sent in the "Host" field), a path, and an HTTP verb (e.g. GET for reading info on a site, and POST for sending information to the site, such as login info).

    The request may have a body, in the case of POST and form submission.
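
    As a rough sketch of those pieces, the Python below assembles the text of a minimal request. The build_request helper, the /index.html and /login paths, and the form fields are hypothetical, and a real browser sends many more header fields.

```python
def build_request(verb: str, host: str, path: str, body: str = "") -> str:
    """Assemble the text of a minimal HTTP/1.1 request."""
    lines = [f"{verb} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # A body needs header fields that describe it.
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line separates the header from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


# A GET has no body; a POST carries the submitted form fields as its body.
get_request = build_request("GET", "example.com", "/index.html")
post_request = build_request(
    "POST", "example.com", "/login", "name=alice&password=secret"
)

print(get_request)
print(post_request)
```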

    HTTP Response: What the Application (via the Web Server) Sends Back to Your Browser

    The response header always has a Status Code! The header may have directives to the browser (e.g. Set-Cookie). 

    The response may also have a "body," which is the HTML that your browser will present as the page you see.

    We'll see examples shortly.

    HTTP Requests and the URL: An Anatomy Lesson

    When you type a URL into your browser, your browser uses that URL to create and then send an HTTP request header (and, possibly, a body). Let's look at a URL in detail:

    https://example.com/myDirectory/myfile?param1=value1&param2=value2 

     

    Here's what's inside:

    • The protocol: https (vs http or ftp) — that is, the protocol your browser is supposed to use.
    • The domain: example.com (or x.edu or y.de or z.ca) — this then maps to an IP address, such as 128.211.23.49.
    • The path: Everything after the domain up to the “?” — a file or program (usually), and in this case "/myDirectory/myfile".
    • The query string: starts with “?”

    The parameters for the program are on the query string:

    • The format is "ParamName=Value".
    • An ampersand “&” separates the parameters

    This is an important idea; much of the time, your HTTP request is invoking a program (such as a "servlet") rather than directly retrieving a file (such as a .jpg).
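
    The same anatomy can be checked with Python's standard urllib.parse module. This is just a sketch using the example URL above; the variable names are arbitrary.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://example.com/myDirectory/myfile?param1=value1&param2=value2"
parts = urlsplit(url)

print(parts.scheme)  # the protocol
print(parts.netloc)  # the domain
print(parts.path)    # the path
print(parts.query)   # the query string, without the leading "?"

# parse_qs splits the query string on "&" and "=". Each value is a list,
# because a parameter name may appear more than once.
params = parse_qs(parts.query)
print(params)
```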

    The HTTP Response

    The application, in concert with the web server, answers each HTTP request with an HTTP response. The response includes a status code in the response header. The response header also may contain directives to your browser (e.g. Set-Cookie) as well as information fields such as "Content-Length."

    The response always has a header and often (but not always) a body. The response body is the material your browser will render for you to see: HTML, image data (e.g. jpg files), and anything else that might get rendered onto your screen by your browser. In many cases, this includes application-level requests, such as login forms — in fact, all application requests to the user come in HTTP responses.
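
    To make the header structure concrete, here is a small Python sketch that splits response header text into a status code and its fields. The parse_response_header helper and the sample text are made up for illustration, and the sketch assumes a well-formed status line with a reason phrase.

```python
def parse_response_header(raw: str):
    """Split raw response header text into a status code, reason, and fields."""
    lines = raw.strip().splitlines()
    # The status line looks like: HTTP/1.1 200 OK
    version, code, reason = lines[0].split(" ", 2)
    # Keep fields as a list of pairs, since a name such as Set-Cookie may repeat.
    fields = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return int(code), reason, fields


# Hypothetical sample header text.
raw = """HTTP/1.1 200 OK
Content-Type: text/html;charset=UTF-8
Content-Length: 92556
Set-Cookie: devicetype=0; path=/
"""

status, reason, fields = parse_response_header(raw)
print(status, reason)
for name, value in fields:
    print(name, "->", value)
```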

    Note: A program can never make an HTTP request to your browser. This has to do with how HTTP's enveloping protocol, TCP, works. Your computer could run a program that accepts HTTP requests, but in that case your program is acting as a server.

    A Real HTTP Request and Response — With Cookies

    Let's look at a real-world example. If I type the following into my browser's address bar

    www.washingtonpost.com/home

     

    ... the browser sends the following as a request (as exposed by "Live HTTP headers," a tool in Firefox):

    GET /home HTTP/1.1
    Host: www.washingtonpost.com
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:26.0) Gecko/20100101 Firefox/26.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    Connection: keep-alive

     

    Here is the response from the site. Note that four cookies are being set via "Set-Cookie":

    
    HTTP/1.1 200 OK
    Last-Modified: Thu, 30 Jan 2014 00:00:34 GMT
    Content-Type: text/html;charset=UTF-8
    Content-Encoding: gzip
    Content-Length: 92556
    Date: Thu, 30 Jan 2014 00:04:47 GMT
    Set-Cookie: client_region=0;Expires=Thursday, 30-January-2014 00:14:47 GMT; path=/; domain=.washingtonpost.com
    Set-Cookie: X-WP-Split=X;Expires=Thursday, 01-January-1970 00:00:00 GMT; path=/; domain=.washingtonpost.com
    Set-Cookie: devicetype=0;Expires=Saturday, 01-March-2014 10:33:47 GMT; path=/; domain=.washingtonpost.com
    Set-Cookie: rpld1=20:usa|21:ma|22:cambridge|23:42.363998|24:-71.084999|;Expires=Thursday, 30-January-2014 01:04:47 GMT; path=/; domain=.washingtonpost.com

        

    This is just the header of the response — or, in our post office analogy from earlier, the envelope. The response body was the HTML for the Washington Post's homepage.


    What About Cookies?

    So, what's a cookie?

    • A name=value pair (just like URL parameters!), e.g. devicetype=0.
    • An expiry date.
    • A domain and path that tell your browser where/when to send the cookie.

    Note: A cookie could hold a lot of info, or it could just hold essentially an index into a table the application holds. (For this latter case, consider how coat check tokens are used.)
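
    Those parts can be pulled out of a Set-Cookie value with a few lines of Python. This is a simplified sketch, not how a browser actually does it: the parse_set_cookie helper is hypothetical and assumes that no value contains a semicolon.

```python
def parse_set_cookie(value: str) -> dict:
    """Split a Set-Cookie value into its name, value, and attributes."""
    pieces = [p.strip() for p in value.split(";") if p.strip()]
    # The first piece is always the name=value pair.
    name, _, cookie_value = pieces[0].partition("=")
    # The remaining pieces are attributes such as Expires, path, and domain.
    attributes = {}
    for piece in pieces[1:]:
        key, _, val = piece.partition("=")
        attributes[key.strip().lower()] = val.strip()
    return {"name": name, "value": cookie_value, "attributes": attributes}


cookie = parse_set_cookie(
    "devicetype=0;Expires=Saturday, 01-March-2014 10:33:47 GMT; "
    "path=/; domain=.washingtonpost.com"
)
print(cookie["name"], "=", cookie["value"])
print(cookie["attributes"])
```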

    But what's a cookie for?

    The answer: Session state — i.e. something that the server/application wants to know about you on the next request.

    Isn't this talk supposed to be about authentication?

    Yes. Cookies are a means of "ongoing authentication."

    Cookies, Sessions, and Ongoing Authentication

    Cookies are a way an application maintains a session (an ongoing relationship with you) even though, from the HTTP protocol's point of view, the relationship is "done" when the response is sent.

    Your "session" with an app typically continues over many, many HTTP requests and responses.

    An important aspect of your session with an app is your authenticated identity! (Of course, this is only for apps that require authentication.)

    Ways Cookies Can be Used to Maintain Authentication Session State (Some Bad, Some Better, One Good)

    Let's consider the following example. A user goes to catlovers.com by clicking on a link (that's an HTTP request) and gets back a login form (that's the HTTP response). When she fills in the form with a name and password and clicks "submit", that's an HTTP POST request.

    After the application checks out her login info, it sends a response, which includes one or more "Set-Cookie" commands in the HTTP response header (just like we saw in the Washington Post example). For example, if the user submits "marlena" and "iLoveMyCatAWholeBunch222" in the login form, catlovers.com might send back in the response header the following:

    Set-Cookie: name=marlena; domain=catlovers.com
    Set-Cookie: password=iLoveMyCatAWholeBunch222; domain=catlovers.com

      

    The response also includes the HTML page for the user's browser to display — presumably pictures of cute cats, links to pages with cat health info, etc. When she clicks on a link on this page back to this same site, that will be a GET, and the cookies that got set will be sent back to catlovers.com. Let's look:

    GET /CatHealth.php HTTP/1.1
    Host: www.catlovers.com
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:26.0) Gecko/20100101 Firefox/26.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Cookie: name=marlena; password=iLoveMyCatAWholeBunch222

      

    All the cookies intended for this site get sent on one line.
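
    Here is a rough Python sketch of how a browser might pick the cookies for a request and join them onto that one line. The cookie_header helper and the store layout are assumptions for illustration; real browsers also check the path, the expiry date, and other attributes.

```python
def cookie_header(stored_cookies, host: str) -> str:
    """Build the Cookie header value for a request to `host`."""
    matching = []
    for cookie in stored_cookies:
        domain = cookie["domain"].lstrip(".")
        # Send the cookie to the domain itself and to its subdomains.
        if host == domain or host.endswith("." + domain):
            matching.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(matching)


# A hypothetical cookie store holding cookies from two different sites.
store = [
    {"name": "name", "value": "marlena", "domain": "catlovers.com"},
    {"name": "password", "value": "iLoveMyCatAWholeBunch222", "domain": "catlovers.com"},
    {"name": "devicetype", "value": "0", "domain": ".washingtonpost.com"},
]

header = cookie_header(store, "www.catlovers.com")
print("Cookie: " + header)
```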

    Some Problems With This Example

    There are some issues with this example as a means of maintaining authentication session state. It's not a great idea to send names and passwords in HTTP headers (where they can possibly be read). Also, it's expensive to authenticate every single request with a name and password. (Typically, the service would call out to another service, often an LDAP directory, to validate the name and password. That's the expensive part.)

    An Alternative: Store Only the User Name

    One alternative method of maintaining authentication session state would be if catlovers.com stores just the user name in the cookie, and not the password:

    Set-Cookie: name=marlena

      

    This isn't great either; there's nothing to stop the user from altering the cookie in her browser's cookie store, so that it says something like "name=HillaryClinton."

    A Better Alternative: Use an Opaque Value

    Alternatively, and better, catlovers.com could store an "opaque value" in the cookie and use it as an index into a table, kept on the server, that holds the user's name, as a way to maintain authentication session state.

    Set-Cookie: index=3a24

      

    This is better than storing a name, but it's still possible for the user to change the value and, if lucky, gain access to someone else's session.
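
    One common way to make lucky guesses impractical is to use a long random value instead of a short one like 3a24. The Python sketch below assumes an in-memory table; the sessions dictionary and the helper names are hypothetical.

```python
import secrets

# Server-side table: opaque session ID -> session state (in memory for this sketch).
sessions = {}


def create_session(user_name: str) -> str:
    """Store state on the server and return a hard-to-guess ID for the cookie."""
    session_id = secrets.token_urlsafe(32)  # built from 32 random bytes
    sessions[session_id] = {"user": user_name}
    return session_id


def lookup_user(session_id: str):
    """Return the user for this ID, or None if the ID is unknown."""
    state = sessions.get(session_id)
    return state["user"] if state else None


sid = create_session("marlena")
print(f"Set-Cookie: index={sid}")
print(lookup_user(sid))
print(lookup_user("3a24"))
```

    The cookie still carries no information about the user; everything meaningful stays in the server's table, like the coat behind the coat-check counter.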

    The Best Alternative: Use Encryption

    The best alternative is to encrypt the cookie, and to encrypt the channel too:

    Set-Cookie: state=2sao9d3Dfe28e3u32bvs4bwid91jd9gle7s

      

    The cookie value could be an encryption of useful session information (e.g. user name, user IP address, time the cookie was last touched), and the cookie itself should be sent over HTTPS (i.e. over an encrypted channel).
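
    Python's standard library has no cipher, so the sketch below swaps in a different, related technique: it signs the cookie with an HMAC rather than encrypting it. That makes the cookie tamper-evident (which addresses the altered-name problem described earlier) but not secret; anyone can still decode and read the payload. The key, the helper names, and the payload are all hypothetical, and real encryption would need a vetted cryptography library.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-long-random-server-secret"  # hypothetical key


def sign_cookie(payload: str) -> str:
    """Return 'data.signature', with both parts base64url-encoded."""
    data = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    sig = hmac.new(SECRET_KEY, data.encode("ascii"), hashlib.sha256).digest()
    return data + "." + base64.urlsafe_b64encode(sig).decode("ascii")


def verify_cookie(cookie: str):
    """Return the payload if the signature checks out, else None."""
    data, _, sig = cookie.partition(".")
    try:
        expected = hmac.new(SECRET_KEY, data.encode("ascii"), hashlib.sha256).digest()
        given = base64.urlsafe_b64decode(sig.encode("ascii"))
    except ValueError:  # malformed base64 or non-ASCII input
        return None
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(given, expected):
        return None
    return base64.urlsafe_b64decode(data.encode("ascii")).decode("utf-8")


cookie = sign_cookie("name=marlena")
print(verify_cookie(cookie))

# A user who swaps in a different name cannot produce a matching signature.
forged_data = base64.urlsafe_b64encode(b"name=HillaryClinton").decode("ascii")
forged = forged_data + "." + cookie.split(".")[1]
print(verify_cookie(forged))
```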

    Sessions and Timer Expiry Forcing Authentication

    Sometimes you'll be using a protected site — such as HARVie — and, when you click on some link within the site, you'll suddenly be shuttled through to a login page. Why? What's likely happened is that your session is "too old," and this has triggered re-authentication. Your session can become "too old" in two ways, each tracked by a timer:

    • Inactivity timer: If you haven't touched the website for a while, you might be past its "inactivity timer."
    • Max session lifetime: Even if you've been active on a site, you might be past the maximum lifetime of a session.

    The application typically maintains "last touched" and "session start" information (either in the cookie, or in state the application stores). The application compares these values to its notion of "maximum inactivity" and "maximum lifetime" to determine whether re-authentication is needed.
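
    That comparison might look something like the Python sketch below. The two limits (30 minutes and 8 hours) and the function name are made-up values for illustration; each application picks its own.

```python
import time

MAX_INACTIVITY_SECONDS = 30 * 60    # hypothetical limit: 30 minutes
MAX_LIFETIME_SECONDS = 8 * 60 * 60  # hypothetical limit: 8 hours


def needs_reauthentication(session_start, last_touched, now=None):
    """Return True if either the inactivity timer or the lifetime timer has run out."""
    if now is None:
        now = time.time()
    too_idle = now - last_touched > MAX_INACTIVITY_SECONDS
    too_old = now - session_start > MAX_LIFETIME_SECONDS
    return too_idle or too_old


# Times are in seconds; the session here started at time 0.
print(needs_reauthentication(session_start=0, last_touched=0, now=60))
print(needs_reauthentication(session_start=0, last_touched=0, now=31 * 60))
print(needs_reauthentication(session_start=0, last_touched=8 * 3600, now=8 * 3600 + 1))
```

    The third call shows the second timer at work: the user was active one second ago, but the session as a whole is past its maximum lifetime.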

    Summary So Far

    We now know about ongoing authentication via cookies — in other words, what's happening so that you don't see a login screen all of the time even after you've logged in. The server has set a (hopefully encrypted) cookie that contains either your name, or an index value that denotes storage corresponding to your name on the server (similar to that coat-check token).

    Before that, we learned about protocols. Let's check our progress against the goals set at the top of this guide:

    •  What a "protocol" is.
    •  How "sessions" are used in "ongoing authentication."
    • How all web authentication protocols use the underlying HTTP protocol to securely authenticate users to web applications.
    • Some commonalities among authentication protocols in use at Harvard.

    Next, we'll learn about HTTP redirects.

    Web Authentication Protocols and HTTP Redirects

    For this example, the user types the address for HARVie (http://harvie.harvard.edu/) into a browser. As you've probably experienced, this typically will "take you" to the login screen. That's via the mechanism of "redirects." Let's look at the request (pared down slightly) and the response:

    GET / HTTP/1.1
    Host: harvie.harvard.edu
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:26.0) Gecko/20100101 Firefox/26.0
     
    HTTP/1.1 302 Moved Temporarily
    Accept-Ranges: bytes
    Age: 0
    Content-Encoding: gzip
    Content-Type: text/html
    Date: Fri, 07 Feb 2014 22:57:58 GMT
    Location: https://www.pin1.harvard.edu/pin/authenticate?__authen_application=VPA_OHR_INTRANET_HARVIE3&redirect=&logintype=ANON

      

    A "redirect" is simply an HTTP response with a "300"-series status code (302 in this case) and a new URL in the "Location" field of the HTTP response header.

    This response (i.e. a 300-series status code plus a location) tells your browser to issue a new GET to the URL of the location.

    GET /pin/authenticate?__authen_application=VPA_OHR_INTRANET_HARVIE3&redirect=/&logintype=ANON HTTP/1.1
    Host: www.pin1.harvard.edu
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:26.0) Gecko/20100101 Firefox/26.0

     

    Some other things to notice: The application sent its name, "VPA_OHR_INTRANET_HARVIE3", as a parameter as part of its authentication request to PIN. In fact, sending its name was the bulk of the authentication request! PIN knows when it sees the parameter  "__authen_application" that an application is requesting authentication of the user.
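
    As a sketch of what the browser does with that response, the Python below turns a redirect into the follow-up GET and then reads the application's name out of the query string. The next_request_for helper is hypothetical and simplified: it treats any 300-series response that carries a Location as a redirect to be followed with a GET.

```python
from urllib.parse import urlsplit, parse_qs


def next_request_for(status: int, headers: dict):
    """If the response is a redirect, return (host, request line) for the follow-up GET."""
    location = headers.get("Location")
    if not (300 <= status < 400) or not location:
        return None
    target = urlsplit(location)
    path = target.path or "/"
    if target.query:
        path += "?" + target.query
    return target.netloc, f"GET {path} HTTP/1.1"


headers = {
    "Location": "https://www.pin1.harvard.edu/pin/authenticate"
    "?__authen_application=VPA_OHR_INTRANET_HARVIE3&redirect=&logintype=ANON"
}
host, request_line = next_request_for(302, headers)
print("Host:", host)
print(request_line)

# keep_blank_values=True keeps the empty "redirect=" parameter in the result.
params = parse_qs(urlsplit(headers["Location"]).query, keep_blank_values=True)
app_name = params["__authen_application"][0]
print(app_name)
```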

    All web authentication protocols have the application requesting authentication send its name, though the terminology differs from one protocol to another.

    Here's a CAS authentication request from an app at example.com to the CAS server. Notice that the app is sending its name as a parameter:

    https://server/cas/login?service=http://www.example.com
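
    An app might build such a URL along the lines of the Python sketch below. The cas_login_url helper is hypothetical, and "server" is the same placeholder host as above. Note that urlencode percent-encodes the service URL, so the result looks a little different from the simplified form shown above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def cas_login_url(cas_base: str, service_url: str) -> str:
    """Build a CAS-style login URL that names the requesting app via 'service'."""
    return cas_base + "/login?" + urlencode({"service": service_url})


url = cas_login_url("https://server/cas", "http://www.example.com")
print(url)

# The receiving side recovers the original service URL by decoding the parameter.
service = parse_qs(urlsplit(url).query)["service"][0]
print(service)
```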

       

    Putting It All Together: Cookies, Redirects, Authentication, and "Sessions"

    All the protocols — PIN, CAS, SAML/Shibboleth, and OAuth — do the same things with respect to protocol message flows, even though details such as the exact format of the authentication request and response differ.
     
    To learn more, have a look at this presentation depicting web authentication protocol flows.

    Commonalities Among Web Authentication Protocols

    Every protocol requires registering the app with the authentication service, and vice versa:
        •    PIN: App registration database; app must know PIN public key and URL.
        •    SAML/Shibboleth: Both sides are registered in "metadata" containing certificates and request/response URLs.
        •    CAS: Apps must register with CAS; app must know CAS public key and URL.
        •    OAuth: Mutual registration and knowledge of certificates and other keys.
     
    No matter the method, the app always sends its name (or identifier) to the authentication service. When it comes to protecting the authentication response, PIN and SAML/Shibboleth use digital signatures on the response, which is sent through redirection. In CAS and OAuth, protection comes from a direct call over HTTPS between the app and the authentication server, based on a "ticket" (CAS) or "grant" (OAuth).

    Continue to the appendix for more information on redirects and cookies in CAS, PIN, and Shibboleth.