Why do we need to know about CORS?


    In this post, I’m going to talk about the CORS (Cross-Origin Resource Sharing) policy, which every web developer has heard of at least once. It is no exaggeration to say that errors caused by CORS policy violations are among the most common in web development, and that every web developer will run into one sooner or later.

    When I was in college, I first encountered this issue while building a toy project with my friends. I was developing a client application in my local environment and tried to call the API of a development server that a friend had built.

    It was a simple task, just fetching data from the API server, but as soon as I sent the request, an unfamiliar red message appeared in the console.

    🚨 Access to fetch at ‘https://api.lubycon.com/me’ from origin ‘http://localhost:3000’ has been blocked by CORS policy: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. If an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.

    what CO...What...? Access control allow origin...?

    Looking back now, it’s an error message that kindly tells me how to fix the problem, but I had so little experience with web applications at the time that I didn’t know what to do with it.

    The basics of CORS

    As such, every CORS-related issue we encounter arises from a violation of the CORS policy. The policy can feel annoying because it gives developers one more thing to care about, but it is precisely this barrier that gives us a minimum guarantee that the resources we bring in from other servers are safe.

    CORS is short for Cross-Origin Resource Sharing. “Cross origin” simply means “different origin”, but I think this is hard for non-native English speakers to grasp, because the nuance of the word “cross” does not come naturally to us.

    So in this post I will write “different origin” instead of “cross origin”, to make things a little easier for non-native English speakers like me.

    Before looking at resource sharing between different origins, let’s briefly dig into what exactly an origin is.

    What is Origin?

    A URL like https://google.com, which points to the location of a server, looks like a single string but is actually made up of several components.

    uri structure

    The origin consists of the protocol (scheme), the host, and the port number, such as :80 or :443. In other words, it combines the most basic pieces of information we need to locate a server.

    Also, as shown in the figure above, the port number can be omitted from the origin, because HTTP and HTTPS each have a fixed default port. The default port for HTTP is defined in RFC 2616, the document that specifies HTTP/1.1.

    3.2.2 http URL


    If the port is empty or not given, port 80 is assumed. The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path (section 5.1.2).

    If a port number is explicitly included in the origin, as in “https://google.com:443”, two origins are considered the same only if their port numbers also match. An origin that omits the port is compared using the default port of its scheme, although not every browser has followed this rule exactly, so the evaluation may vary in some cases.
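
    The port behaviour is easy to check in code. The following is a minimal sketch using the standard URL class, which is available in browsers and in Node.js; the sample URLs and the port 8000 are only illustrative values.

```javascript
// Break a URL into the three parts that make up its origin.
const url = new URL('https://evan-moon.github.io:8000/about?q=hello');

const parts = {
  scheme: url.protocol, // 'https:'
  host: url.hostname,   // 'evan-moon.github.io'
  port: url.port,       // '8000'
  origin: url.origin,   // 'https://evan-moon.github.io:8000'
};
console.log(parts);

// A default port (443 for https) is dropped when the URL is parsed,
// so it does not appear in the origin string.
const withDefaultPort = new URL('https://evan-moon.github.io:443/about');
console.log(withDefaultPort.port);   // ''
console.log(withDefaultPort.origin); // 'https://evan-moon.github.io'
```

    Note that the path and the query string never become part of the origin.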

    We can easily find the origin our application is running on by reading the origin property of the Location object in the browser’s developer tools console.

    console.log(location.origin);
    "https://evan-moon.github.io"

    SOP (Same-Origin Policy)

    There are two policies in the web ecosystem that govern requests for resources from other origins. One is CORS, the topic of this post, and the other is SOP (Same-Origin Policy).

    SOP is a web security policy that browsers have enforced since the Netscape days of the mid-1990s, and its concept of an origin was formally documented in RFC 6454 in 2011. It is literally a policy with the rule that “you can only share resources within the same origin”.

    However, the web is an open environment where it is very common to fetch and use resources from different origins, so this cannot be forbidden without exception. The policy therefore defines some exceptions and allows requests that fall under them even when the origins differ. One of those exceptions is “a request for a resource that does not violate the CORS policy”. (The name CORS first appeared in 2009, which is earlier than RFC 6454.)

    Access to network resources varies depending on whether the resources are in the same origin as the content attempting to access them.

    Generally, reading information from another origin is forbidden. However, an origin is permitted to use some kinds of resources retrieved from other origins. For example, an origin is permitted to execute script, render images, and apply style sheets from any origin. Likewise, an origin can display content from another origin, such as an HTML document in an HTML frame. Network resources can also opt into letting other origins read their information, for example, using Cross-Origin Resource Sharing.

    RFC 6454 - 3.4.2 Network Access

    If we request a resource from a different origin, we fall outside the SOP, and if the request also fails to comply with the CORS policy, which is an exception to the SOP, the resource cannot be used at all.

    In other words, restricting the use of resources from different origins is not decided by a single policy. If a request matches neither the exceptions defined in the SOP nor the cases permitted by CORS, we cannot use resources from other origins.

    But why would anyone make such annoying policies to bother developers? After all, developers write their code to communicate only with the servers they intend.

    trust Can't you just trust the developer...?

    However, if we think about it a little more, an environment in which two applications from different origins can communicate freely is quite dangerous.

    We must not forget that client applications, especially those running on the web, are very exposed to malicious users.

    With Chrome’s developer tools, a user can inspect a great deal of important information without restriction: the structure of the DOM, which servers the client communicates with, and where each resource comes from.

    Some say the code is hard to read because we uglify the JavaScript source, but uglifying is only obfuscation, not encryption.

    Even uglified source is not incomprehensible to humans, and the fact that anyone can view the source code directly is a real security weakness.

    In this situation, if there were no rules governing how applications from different origins communicate, a malicious user could read the source code and use techniques such as CSRF (Cross-Site Request Forgery) or XSS (Cross-Site Scripting) to make it look as if their code were running in your application.

    hacker
    I think that gonna be fun to inject my script on this site...!

    We keep talking about the same origin and different origins, so in exactly which cases does the web evaluate two origins as the same, and in which cases as different?

    Same origin and different origin

    The logic for determining whether two origins are the same is very simple: of all the components of the two URLs, only three, the scheme, host, and port, need to match.

    For example, the following URLs compare against the origin of my blog, https://evan-moon.github.io, as shown below.

    URL | Same origin? | Reason
    https://evan-moon.github.io/about | O | Scheme, host, and port are the same
    https://evan-moon.github.io/about?q=안뇽 | O | Scheme, host, and port are the same
    https://user:password@evan-moon.github.io | O | Scheme, host, and port are the same
    http://evan-moon.github.io | X | Scheme is different
    https://api.github.io | X | Host is different
    https://evan-moon.naver.com | X | Host is different
    https://evan-moon.github.com | X | Host is different
    https://evan-moon.github.io:8000 | X | Port is different (some browsers have ignored the port)

    The last case deserves a closer look. If my blog’s origin carried an explicit port, such as https://evan-moon.github.io:80, it would clearly be a different origin from https://evan-moon.github.io:8000. Here, however, my blog’s origin has no port at all. RFC 6454 compares origins as scheme/host/port triples and fills in the default port of the scheme (443 for https) when a URL omits it, so a browser that follows the standard treats 443 and 8000 as different ports and the two origins as different.

    Not every browser has implemented this comparison the same way, though, so in practice each browser’s own origin comparison logic applies.
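
    The comparison can be sketched in a few lines. This is only an illustration built on the standard URL class, and isSameOrigin is a hypothetical helper name. It follows the URL standard, which fills in the default port and therefore reports the :8000 case as a different origin; it does not model browsers that have deviated from that rule.

```javascript
// Two URLs share an origin when scheme, host, and port all match.
// new URL() drops a scheme's default port, so an omitted port and an
// explicit default port produce the same origin string.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

const base = 'https://evan-moon.github.io';
const candidates = [
  'https://evan-moon.github.io/about',
  'https://user:password@evan-moon.github.io',
  'http://evan-moon.github.io',
  'https://api.github.io',
  'https://evan-moon.github.io:8000',
  'https://evan-moon.github.io:443',
];

// Pair each candidate with the result of comparing it to the blog's origin.
const results = candidates.map((url) => [url, isSameOrigin(base, url)]);
console.log(results);
```

    The first two candidates and the explicit :443 one come out as the same origin; the others differ in scheme, host, or port.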

    ie is trash
    Internet Explorer is the only browser that completely ignores port numbers when comparing origins.
    So let's throw it to the trashcan now.
    - memdroid -


    The important thing is that this origin comparison logic is implemented not in the server but in the browser.

    Even if we request a resource in violation of the CORS policy, the server responds normally unless it has its own logic that only accepts requests from the same origin. The browser then inspects the headers of that response, and if it determines that the request violated the CORS policy, it discards the response instead of handing it to our code.

    cors The server responds normally even if the request violates CORS,
    and the browser decides whether to discard the response

    In other words, CORS is part of the browser’s implementation, so the policy does not apply when servers communicate with each other without going through a browser. It also means that even when a CORS violation raises an error in the client application, the server logs only show a successful response. So if we don’t know exactly how CORS works, these problems can be hard to track down.

    How does CORS work?

    Now let’s take a close look at how we can safely use resources from different origins.

    When a web client application requests a resource from a different origin, it uses the HTTP protocol, and the browser puts the origin the request is sent from into the Origin field of the request header.

    Origin: https://evan-moon.github.io

    The server then responds, and in the Access-Control-Allow-Origin field of the response header it states “the origin that is allowed to access this resource”. The browser compares the Origin it sent in the request header with the Access-Control-Allow-Origin returned by the server and decides whether the response is valid.
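
    The comparison the browser makes can be sketched roughly as follows. This is a simplified illustration, not the browser’s actual code: isResponseAllowed is a hypothetical function name, and the sketch ignores requests that carry credentials, which are subject to extra rules.

```javascript
// requestOrigin: the value the browser sent in the Origin request header
// allowOrigin:   the Access-Control-Allow-Origin response header (or undefined)
function isResponseAllowed(requestOrigin, allowOrigin) {
  if (allowOrigin === undefined) return false; // header missing: blocked
  if (allowOrigin === '*') return true;        // wildcard: any origin may read
  return allowOrigin === requestOrigin;        // otherwise it must match exactly
}

const myOrigin = 'https://evan-moon.github.io';
console.log(isResponseAllowed(myOrigin, 'https://evan-moon.github.io'));  // true
console.log(isResponseAllowed(myOrigin, '*'));                            // true
console.log(isResponseAllowed(myOrigin, 'https://evanmoon.tistory.com')); // false
console.log(isResponseAllowed(myOrigin, undefined));                      // false
```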

    Although the basic flow of CORS is simple, CORS actually works in one of three scenarios rather than one. Errors caused by CORS policy violations become much easier to fix once we understand which scenario our request falls into.

    Preflight Request

    Preflight is the most common scenario we encounter when developing web applications. In this scenario, the browser does not send the request all at once but splits it into a preflight request and a main request.

    The preflight request uses the HTTP OPTIONS method. Its role is to let the browser confirm for itself, before sending the main request, that sending it is safe.

    This process can be roughly expressed as a simple flow chart.

    cors preflight The browser sends a preflight request before the main request and checks whether the request is valid

    When we use JavaScript’s fetch API to ask the browser for a resource, the browser first sends a preflight request to the server. In its response, the server tells the browser in the response headers what is allowed and what is forbidden.

    The browser then compares its preflight request with the permissions in the server’s response, and if it decides the main request is safe to send, it sends the request to the same endpoint. When the server responds to this main request, the browser finally hands the response data to JavaScript.

    This flow is easy to reproduce in the browser’s developer tools console. From my blog, if we send a request to the RSS file of my Tistory blog, we can see the browser send a preflight request with the OPTIONS method before the main request.

    // A Content-Type of text/xml on a cross-origin request makes the browser send a preflight first
    const headers = new Headers({
      'Content-Type': 'text/xml',
    });
    fetch('https://evanmoon.tistory.com/rss', { headers });

    OPTIONS https://evanmoon.tistory.com/rss
    
    Accept: */*
    Accept-Encoding: gzip, deflate, br
    Accept-Language: en-US,en;q=0.9,ko;q=0.8,ja;q=0.7,la;q=0.6
    Access-Control-Request-Headers: content-type
    Access-Control-Request-Method: GET
    Connection: keep-alive
    Host: evanmoon.tistory.com
    Origin: https://evan-moon.github.io
    Referer: https://evan-moon.github.io/2020/05/21/about-cors/
    Sec-Fetch-Dest: empty
    Sec-Fetch-Mode: cors
    Sec-Fetch-Site: cross-site

    If we look at the headers of the preflight request the browser sent, we can see that it carries not only the Origin but also information about the main request that will follow.

    Through the Access-Control-Request-Headers and Access-Control-Request-Method fields of the preflight request, the browser tells the server that the main request will use the Content-Type header and the GET method.
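
    To make the server’s side of this exchange concrete, here is a minimal sketch of how a server might evaluate those fields. Everything in it is an assumption for illustration: the policy object, the answerPreflight function, and the choice of 204 and 403 status codes are hypothetical and do not describe how the Tistory server works. Only the header names are the standard CORS ones.

```javascript
// Hypothetical policy: what this server is willing to accept.
const policy = {
  origins: ['https://evan-moon.github.io'],
  methods: ['GET', 'POST'],
  headers: ['content-type'],
};

// Build the response to a preflight (OPTIONS) request.
// requestHeaders: lower-cased header names mapped to their values.
function answerPreflight(requestHeaders, policy) {
  const origin = requestHeaders['origin'];
  const method = requestHeaders['access-control-request-method'];
  const wanted = (requestHeaders['access-control-request-headers'] || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean);

  const allowed =
    policy.origins.includes(origin) &&
    policy.methods.includes(method) &&
    wanted.every((name) => policy.headers.includes(name));

  if (!allowed) {
    // No CORS headers at all: the browser will not send the main request.
    return { status: 403, headers: {} };
  }
  return {
    status: 204,
    headers: {
      'Access-Control-Allow-Origin': origin,
      'Access-Control-Allow-Methods': policy.methods.join(', '),
      'Access-Control-Allow-Headers': policy.headers.join(', '),
      Vary: 'Origin',
    },
  };
}

const preflightResponse = answerPreflight(
  {
    origin: 'https://evan-moon.github.io',
    'access-control-request-method': 'GET',
    'access-control-request-headers': 'content-type',
  },
  policy,
);
console.log(preflightResponse.status, preflightResponse.headers);
```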

    The browser has sent its preflight request to the Tistory server, so now the Tistory server sends back a response to it.

    OPTIONS https://evanmoon.tistory.com/rss 200 OK
    
    Access-Control-Allow-Origin: https://evanmoon.tistory.com
    Content-Encoding: gzip
    Content-Length: 699
    Content-Type: text/xml; charset=utf-8
    Date: Sun, 24 May 2020 11:52:33 GMT
    P3P: CP='ALL DSP COR MON LAW OUR LEG DEL'
    Server: Apache
    Vary: Accept-Encoding
    X-UA-Compatible: IE=Edge

    What we need to note in this response is the value Access-Control-Allow-Origin: https://evanmoon.tistory.com included in the header.

    The Tistory server told the browser that the only origin allowed to access this resource is https://evanmoon.tistory.com. However, the origin I sent this request from is https://evan-moon.github.io, which differs from the origin the server allows.

    Therefore, the browser determines that this request violates the CORS policy and throws the following error.

    🚨 Access to fetch at ‘https://evanmoon.tistory.com/rss’ from origin ‘https://evan-moon.github.io’ has been blocked by CORS policy: Response to preflight request doesn’t pass access control check: The ‘Access-Control-Allow-Origin’ header has a value ‘http://evanmoon.tistory.com’ that is not equal to the supplied origin. Have the server send the header with a valid value, or, if an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.

    What is confusing here is that the console shows a red error even though nothing went wrong with the response to the preflight request and a 200 status code was returned normally. In fact, a successful status code on the preflight request does not by itself mean the CORS policy was satisfied.

    This is because the browser determines whether the CORS policy is violated after receiving the response to the preflight request.

    Of course, a preflight request that fails outright with a non-2xx status is also treated as a CORS failure, but a 200 status alone is not enough: what matters is whether the response headers carry a valid Access-Control-Allow-Origin value. The same holds for the main request, where even an error response such as a 404 or a 500 passes the CORS check and can be read by JavaScript as long as that header is valid.

    In most cases the preflight scenario applies and the request is split into a preflight request and a main request, but the request is not always sent twice. Under some fairly demanding conditions, the browser checks for CORS violations using only the main request, without a preflight request.

    Simple Request

    There is no official name for this scenario, but MDN’s CORS document calls it a Simple Request, so I will do the same.

    In the simple request scenario, the browser sends the main request straight to the server without a preflight request, the server returns values such as Access-Control-Allow-Origin in the response header, and the browser then checks for CORS policy violations. In other words, the preflight and simple request scenarios share the same overall logic and differ only in whether a preflight request is sent.

    simple request The simple request scenario is a scenario in which CORS violations are checked with the main request without a preflight request.

    However, we cannot use a simple request whenever we like. The preflight request is skipped only when certain conditions are met. Moreover, these conditions are quite hard to meet, so with a typical web application architecture it is almost impossible to satisfy them. I have hardly ever run into such a case myself.


    1. The request method must be one of GET, HEAD, and POST.
    2. Headers other than Accept, Accept-Language, Content-Language, Content-Type, DPR, Downlink, Save-Data, Viewport-Width, and Width should not be used.
    3. If Content-Type is used, only application/x-www-form-urlencoded, multipart/form-data, and text/plain are allowed as values of this field.

    In fact, condition 1 is easy to satisfy: we just avoid methods such as PUT or DELETE. Conditions 2 and 3 are a different story.

    Since the header fields allowed by condition 2 are really basic ones, a complex commercial web application almost always uses additional headers. Even the Authorization header used for user authentication is not on the list.

    In addition, since HTTP APIs are usually designed with a content type such as text/xml or application/json, it is not easy in practice to satisfy all of these conditions.
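
    The three conditions can be written down as a small predicate. This is a sketch that follows the list in this post, not the full rules browsers apply (the real rules also restrict header values, for example); isSimpleRequest and the constant names are hypothetical.

```javascript
// Header names from condition 2 (lower-cased for comparison).
const SAFE_HEADERS = new Set([
  'accept', 'accept-language', 'content-language', 'content-type',
  'dpr', 'downlink', 'save-data', 'viewport-width', 'width',
]);
const SAFE_METHODS = new Set(['GET', 'HEAD', 'POST']);
const SAFE_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded', 'multipart/form-data', 'text/plain',
]);

// Returns true when a request meets all three conditions, i.e. when
// the browser would skip the preflight request.
function isSimpleRequest(method, headers = {}) {
  if (!SAFE_METHODS.has(method.toUpperCase())) return false; // condition 1
  for (const [name, value] of Object.entries(headers)) {
    const key = name.toLowerCase();
    if (!SAFE_HEADERS.has(key)) return false; // condition 2
    if (key === 'content-type') {
      // Ignore parameters such as "; charset=utf-8" when comparing.
      const mediaType = value.split(';')[0].trim().toLowerCase();
      if (!SAFE_CONTENT_TYPES.has(mediaType)) return false; // condition 3
    }
  }
  return true;
}

console.log(isSimpleRequest('GET'));                                    // true
console.log(isSimpleRequest('POST', { 'Content-Type': 'text/plain' })); // true
console.log(isSimpleRequest('GET', { 'Content-Type': 'text/xml' }));    // false
console.log(isSimpleRequest('GET', { Authorization: 'Bearer token' })); // false
console.log(isSimpleRequest('DELETE'));                                 // false
```

    The third call mirrors the Tistory example from the preflight section: the text/xml content type alone is enough to force a preflight request.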

    Credentialed Request

    The third scenario is the credentialed request. It is not part of the basic CORS flow; it applies when we want tighter security in communication between different origins.

    By default, the asynchronous resource request APIs provided by the browser, the XMLHttpRequest object and the fetch API, do not attach the browser’s cookies or authentication-related headers to cross-origin requests. To include this authentication-related information, we use the credentials option.

    A total of 3 values can be used for this option, and the meaning of each value is as follows.

    Option | Description
    same-origin (default) | Authentication-related information is included only in requests to the same origin
    include | Authentication-related information is included in every request
    omit | Authentication-related information is never included in any request

    If a request carries authentication information, for example because we set this option to include, the browser no longer checks only the Access-Control-Allow-Origin field; it adds stricter rules to the CORS check.

    Let’s take a closer look at these additional rules using my local environment, where I am writing this post, and the GitHub server that hosts my blog.

    On my blog, Access-Control-Allow-Origin is set to *, which means every origin is allowed. So a request from another origin for a resource on my blog server is never blocked by the CORS policy.

    So even from a local development environment such as http://localhost:8000, we can freely request and retrieve resources with the fetch API.

    fetch('https://evan-moon.github.io/feed.xml');
    Request
    GET https://evan-moon.github.io/feed.xml
    
    Origin: http://localhost:8000
    Referer: http://localhost:8000/2020/05/21/about-cors/
    Response
    GET https://evan-moon.github.io/feed.xml 200 OK
    
    Access-Control-Allow-Origin: *
    Content-Encoding: gzip
    Content-Length: 1132748
    Content-Type: application/xml
    Server: GitHub.com
    Status: 200

    Also, the default value of credentials in Google Chrome is same-origin, which means that authentication-related information is used only within the same origin. So a request sent from my local environment to https://evan-moon.github.io cannot carry authentication-related information such as browser cookies.

    That is why the browser simply checks the value Access-Control-Allow-Origin: * and concludes that “this request is safe”. However, if I change the credentials option to include, which attaches authentication-related information to every request, and send the same request again, the situation is a little different.

    fetch('https://evan-moon.github.io/feed.xml', {
      credentials: 'include', // Credentials is changed to "include"!
    });

    If we run this code in the browser console, the credentials: include option makes the request carry authentication-related information regardless of origin, and we can confirm that the request now contains the browser’s cookies.

    The GitHub server hosting my blog sent the same response as before, but the browser’s behavior changed a bit.

    🚨 Access to fetch at ’https://evan-moon.github.io/feed.xml’ from origin ’http://localhost:8000’ has been blocked by CORS policy: The value of the ‘Access-Control-Allow-Origin’ header in the response must not be the wildcard ’*’ when the request’s credentials mode is ‘include’.

    The browser is telling us that when the credentials mode is include, the Access-Control-Allow-Origin header must not be *, the value that allows every origin.

    In this way, when a request to another origin carries authentication-related information, the browser adds the following two rules to its CORS check.

    1. * cannot be used in the Access-Control-Allow-Origin field; the value must be an explicit origin.
    2. The response header must contain Access-Control-Allow-Credentials: true.
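
    The two rules can be added to the origin check like this. As before, this is a simplified sketch rather than the browser’s real implementation, and passesCorsCheck is a hypothetical function name.

```javascript
// responseHeaders: lower-cased header names mapped to their values.
function passesCorsCheck(requestOrigin, credentialsMode, responseHeaders) {
  const allowOrigin = responseHeaders['access-control-allow-origin'];
  const allowCredentials = responseHeaders['access-control-allow-credentials'];

  if (credentialsMode !== 'include') {
    // No credentials are sent cross-origin: the wildcard is acceptable.
    return allowOrigin === '*' || allowOrigin === requestOrigin;
  }
  // Rule 1: the wildcard is rejected, the origin must be named explicitly.
  if (allowOrigin !== requestOrigin) return false;
  // Rule 2: the server must opt in to credentials.
  return allowCredentials === 'true';
}

const local = 'http://localhost:8000';
const wildcard = { 'access-control-allow-origin': '*' };
const explicit = {
  'access-control-allow-origin': 'http://localhost:8000',
  'access-control-allow-credentials': 'true',
};

console.log(passesCorsCheck(local, 'same-origin', wildcard)); // true
console.log(passesCorsCheck(local, 'include', wildcard));     // false
console.log(passesCorsCheck(local, 'include', explicit));     // true
```

    The second call corresponds to the error above: the same wildcard response that was fine before is rejected once the request includes credentials.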

    This scenario, with authentication involved, may feel a bit more complicated than the others. But knowing the different CORS scenarios like this greatly shortens the time we spend lost when a CORS policy violation shows up in real life, so it is worth becoming familiar with them.

    How to solve CORS policy violations

    So far, we have looked at what CORS is and in what situations it is applied and violated. In this section, let’s look at what can be done when a problem occurs due to a violation of the CORS policy.

    Setting Access-Control-Allow-Origin

    The most typical way to solve a CORS policy violation is simply to set a valid value for the Access-Control-Allow-Origin header on the server.

    Using the wildcard * accepts requests from every origin, which is convenient right away. Seen from another angle, though, it means that pages on origins we know nothing about can read our responses. Solving the problem this way can therefore create serious security issues.

    So even if it is a bit of a bother, try to specify the origin explicitly, like Access-Control-Allow-Origin: https://evan.github.io.

    This header can be added in the configuration of a web server such as Nginx or Apache, but complex rules are awkward to express there, so I recommend setting it with response middleware in the application code.

    Well-known backend frameworks such as Spring, Express, and Django all provide CORS configuration or middleware libraries, so the setup itself will not be difficult.
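
    What such middleware does at its core can be sketched without any framework. The allowlist and the corsHeadersFor function below are hypothetical names chosen for illustration; in a real project the framework’s own CORS middleware would usually take their place.

```javascript
// Hypothetical allowlist of origins that may read our responses.
const ALLOWED_ORIGINS = new Set([
  'https://evan.github.io',
  'https://www.evan.com',
]);

// Returns the CORS headers to attach to a response for the given Origin.
function corsHeadersFor(requestOrigin) {
  if (!ALLOWED_ORIGINS.has(requestOrigin)) {
    // Unknown origin: send no CORS headers, so the browser blocks the read.
    return {};
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin, // echo the single matching origin
    Vary: 'Origin', // tell caches that the response differs per Origin
  };
}

console.log(corsHeadersFor('https://evan.github.io'));
console.log(corsHeadersFor('https://unknown.example'));
```

    Echoing back the one matching origin, instead of sending *, lets a server support several trusted origins while still naming each of them explicitly.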

    Reverse proxying with Webpack Dev Server

    In fact, it is no exaggeration to say that the most common place to hit a CORS policy violation is front-end development in a local environment, because it is rare for a server to list a generic origin such as http://localhost:3000 in a field as important as Access-Control-Allow-Origin.

    Most front-end developers use Webpack and webpack-dev-server to build their local development environment. With the proxy feature these tools provide, we can sidestep the CORS policy very comfortably.

    module.exports = {
      devServer: {
        proxy: {
          '/api': {
            target: 'https://api.evan.com',
            changeOrigin: true,
            pathRewrite: { '^/api': '' },
          },
        }
      }
    }

    With this configuration, requests in the local environment to URLs starting with /api are proxied to https://api.evan.com. In other words, the browser thinks it sent a request to localhost:8000/api, but Webpack is forwarding the request to https://api.evan.com behind the scenes. This way we trick the browser into believing the CORS policy was followed.
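
    The rewrite the proxy performs can be pictured with a small sketch. This is a conceptual illustration only, not webpack-dev-server’s actual code; resolveProxyTarget is a hypothetical helper, and the config object simply repeats the target and pathRewrite values from the configuration above.

```javascript
const proxyConfig = {
  '/api': {
    target: 'https://api.evan.com',
    pathRewrite: { '^/api': '' },
  },
};

// Work out where a request path would be forwarded, if anywhere.
function resolveProxyTarget(path, config) {
  for (const [prefix, rule] of Object.entries(config)) {
    if (!path.startsWith(prefix)) continue;
    let rewritten = path;
    // Apply each pathRewrite pattern as a regular expression.
    for (const [pattern, replacement] of Object.entries(rule.pathRewrite || {})) {
      rewritten = rewritten.replace(new RegExp(pattern), replacement);
    }
    return rule.target + rewritten;
  }
  return null; // not proxied: handled by the dev server itself
}

console.log(resolveProxyTarget('/api/me', proxyConfig));     // 'https://api.evan.com/me'
console.log(resolveProxyTarget('/index.html', proxyConfig)); // null
```

    Because the browser only ever talks to the dev server on its own origin, no cross-origin request happens from its point of view.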

    trap Webpack's Trap Card Reverse Proxying has been activated!

    Even if you have built your own development environment from webpack-dev-middleware and a Node server, don’t worry: you can easily set up the same proxy with the http-proxy-middleware library. (webpack-dev-server uses http-proxy-middleware internally.)

    I recommend this method when, in production, the client application and the API server are served from the same origin.

    Of course, in the local development environment there is nothing wrong with Webpack proxying requests, but once the application is built and deployed, webpack-dev-server is no longer running. The request is then no longer proxied: the API request goes to the server that serves the client application rather than to the API server, and it fails.

    For example, if the origin of the API server is https://api.evan.com and the origin of the server serving the client application is https://www.evan.com, the following situation occurs.

    fetch('/api/me');
    in local environment...
    GET https://api.evan.com/me 200 OK
    
    There is no proxying logic in production environment...
    GET https://www.evan.com/api/me 404 Not Found

    Of course, we can also switch the API host per environment using an environment variable such as process.env.NODE_ENV in the business logic. However, I avoid this because I don’t think code that exists only for the development environment belongs in business logic.

    Wouldn’t it be possible to put the request in the img tag?

    I said earlier that the SOP has several exceptions that allow access to resources from different origins, and that one of them is a request that complies with the CORS policy. There are other exceptions too, such as executing scripts, rendering images, and applying style sheets.

    So wouldn’t it be possible to get around a CORS policy violation by sending the request through one of those other exceptions? Like this!

    <img src="https://evanmoon.tistory.com/rss">
    <script src="https://evanmoon.tistory.com/rss"></script>

    Well, if we request a resource in this way, the request succeeds without violating the CORS policy. And if we look closely at the headers of these requests in the network tab of the browser’s developer tools, we can see that they contain a value of Sec-Fetch-Mode: no-cors.

    This Sec-Fetch-Mode header indicates the mode of the request. If its value is no-cors, the browser does not check for CORS policy violations even when the resource is served from another origin. Sadly, though, the browser does not expose the response to such a request to JavaScript, which means we can never read the body of this response from JavaScript code.

    I was curious whether there really was no way around this, so I tried several approaches, and all of them failed. So let’s give up on circumventing CORS and just follow the policy, as the smart people tell us to.

    Epilogue

    Perhaps the hardest thing about CORS policy violations is that the person who runs into the problem and the person who has to fix it are different people.

    As I said, the CORS policy is part of the browser’s implementation, so the people who actually run into CORS violations are mostly front-end developers. To solve the problem, however, a backend developer has to set the correct value in the Access-Control-Allow-Origin field of the server application’s response header.

    Of course, front-end developers can work around it on their own with the proxy option of webpack-dev-server described above, but that only works in a local development environment. In other words, it is not a solution for a production environment.

    Therefore, in the end, in order to properly solve the problem of CORS policy violations, it is inevitable that we need help from backend developers.

    In fact, resolving a CORS policy violation is not that difficult or complex, so if either the front-end or the backend developer is familiar with the policy, it can be solved more quickly and easily than you might think.

    However, a solution is hard to find when neither side has experience with the problem, as was the case when I first ran into a CORS policy violation. It’s a subtle problem.

    Evan Moon

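    🐢 Let’s live like a turtle

    I’m a developer who tries to enjoy development rather than just to be good at it. I mostly jot down everything from small thoughts to tutorials and accounts of my trial and error.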