Learn basics of web application security
Learn basics of web application security
Modern web development has many challenges, and of those security is both very important and often under-emphasized. While such techniques as threat analysis are increasingly recognized as essential to any serious development, there are also some basic practices which every developer can and should be doing as a matter of course.
Before jumping into the nuts and bolts of input and output, it's worth mentioning one of the most crucial underlying principles of security: trust. We have to ask ourselves: do we trust the integrity of request coming in from the user’s browser? (hint: we don’t). Do we trust that upstream services have done the work to make our data clean and safe? (hint: nope). Do we trust the connection between the user’s browser and our application cannot be tampered? (hint: not completely...). Do we trust that the services and data stores we depend on? (hint: we might...)
An HTML document is really a collection of nested execution contexts separated by tags, like
Whether you are writing SQL against a relational database, using an object-relational mapping framework, or querying a NoSQL database, you probably need to worry about how input data is used within your queries.
The database is often the most crucial part of any web application since it contains state that can't be easily restored. It can contain crucial and sensitive customer information that must be protected. It is the data that drives the application and runs the business. So you would expect developers to take the most care when interacting with their database, and yet injection into the database tier continues to plague the modern web application even though it's relatively easy to prevent!
Sometimes we encounter situations where there is tension between good security and clean code. Security sometimes requires the programmer to add some complexity in order to protect the application. In this case however, we have one of those fortuitous situations where good security and good design are aligned. In addition to protecting the application from injection, introducing bound parameters improves comprehensibility by providing clear boundaries between code and content, and simplifies creating valid SQL by eliminating the need to manage the quotes by hand.
As you introduce parameter binding to replace your string formatting or concatenation, you may also find opportunities to introduce generalized binding functions to the code, further enhancing code cleanliness and security. This highlights another place where good design and good security overlap: de-duplication leads to additional testability, and reduction of complexity.
While we're on the subject of input and output, there's another important consideration: the privacy and integrity of data in transit. When using an ordinary HTTP connection, users are exposed to many risks arising from the fact data is transmitted in plaintext. An attacker capable of intercepting network traffic anywhere between a user's browser and a server can eavesdrop or even tamper with the data completely undetected in a man-in-the-middle attack. There is no limit to what the attacker can do, including stealing the user's session or their personal information, injecting malicious code that will be executed by the browser in the context of the website, or altering data the user is sending to the server.
We can't usually control the network our users choose to use. They very well might be using a network where anyone can easily watch their traffic, such as an open wireless network in a café or on an airplane. They might have unsuspectingly connected to a hostile wireless network with a name like "Free Wi-Fi" set up by an attacker in a public place. They might be using an internet provider that injects content such as ads into their web traffic, or they might even be in a country where the government routinely surveils its citizens.
If an attacker can eavesdrop on a user or tamper with web traffic, all bets are off. The data exchanged cannot be trusted by either side. Fortunately for us, we can protect against many of these risks with HTTPS.
Browsers have a built-in security feature to help avoid disclosure of a cookie containing sensitive information. Setting the "secure" flag in a cookie will instruct a browser to only send a cookie when using HTTPS. This is an important safeguard to make use of even when HSTS is enabled.
When developing applications, you need to do more than protect your assets from attackers. You often need to protect your users from attackers, and even from themselves.
As a stateless protocol HTTP offers no built-in mechanism for relating user data across requests. Session management is commonly used for this purpose, both for anonymous users and for users who have authenticated. As we mentioned earlier, session management can apply both to human users and to services.
We discussed how authentication establishes the identity of a user or system (sometimes referred to as a principal or actor). Until that identity is used to assess whether an operation should be permitted or denied, it doesn't provide much value. This process of enforcing what is and is not permitted is authorization. Authorization is generally expressed as permission to take a particular action against a particular resource, where a resource is a page, a file on the files system, a REST resource, or even the entire system.