Monday, February 9, 2015

How HTTPS works

In this post I would like to explain how HTTPS works. Let us simplify things and assume there is no web in this world. I want to carry out some transactions with my bank. The bank is in another city, and to complete a transaction I need to send some secret documents to the bank and receive some secret documents from it. Let us say the only way to send and receive documents is through the postal service. Now that we know the scenario, let us see what our problems are.


  1. Identity of the bank: Because the bank is in another city and I am not going there in person, how would I know the bank is genuine? I have to verify the bank's identity before sending any documents.
  2. Security in transit: How would I make sure that nobody is able to read my documents while they are in transit?

Verifying identity and so on is too much work for me, so let me hire a personal assistant (PA) to help me with this bank transaction.

To solve the first problem (identity of the bank) we need an authority which can say 'Yes, such-and-such bank is a genuine bank'. This authority can issue a certificate to the bank, and the bank needs to show that certificate to its client (my PA). As the client, my PA can check with the authority that the certificate was actually issued by them and that it is still valid. If yes, I can be sure that I am dealing with a genuine bank. This involves the following tasks:
  a. The bank needs to get a certificate from the authority.
  b. Before sending documents to the bank, my PA asks for the bank's certificate.
  c. The bank provides the certificate.
  d. My PA verifies the certificate with the authority. If it is found correct, my PA proceeds to the next step; otherwise he should warn me and give me the option to go ahead with the transaction or decline.
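
For readers who like code, here is a minimal sketch of what the PA's check in step d looks like in a real client, using Python's standard `ssl` module. It assumes the client trusts the operating system's list of authorities; the variable names `strict` and `lenient` are only for illustration.

```python
import ssl

# A default client context behaves like a careful PA: it requires a
# certificate and checks that it was issued for the host being contacted.
strict = ssl.create_default_context()
print(strict.verify_mode == ssl.CERT_REQUIRED)
print(strict.check_hostname)

# Choosing "go ahead anyway" corresponds to switching the checks off.
# check_hostname must be disabled before verify_mode can be relaxed.
lenient = ssl.create_default_context()
lenient.check_hostname = False
lenient.verify_mode = ssl.CERT_NONE
```

The lenient context still encrypts the conversation, but it no longer tells you who is at the other end, which is why a warning is shown before that choice is allowed.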

To solve the second problem (security in transit) we can lock our documents in a box and make two identical keys that open it. One key is used by the bank and the other by my PA. That way we can be sure that no one can see or steal our documents on the way. Let us call this key the symmetric key.
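
To make the idea of one key that both locks and unlocks concrete, here is a toy sketch in Python. The names `lock`, `unlock` and `keystream`, and the SHA-256 based construction, are made up for illustration and are not what HTTPS actually uses (real connections use ciphers such as AES), so please do not protect real data with it.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def lock(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; doing it twice restores the data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


unlock = lock  # the same key and the same operation open the box

symmetric_key = secrets.token_bytes(32)   # both sides hold a copy of this
document = b"Transfer 100 to account 12345"

box = lock(symmetric_key, document)       # what travels through the post
assert unlock(symmetric_key, box) == document
```

Anyone who intercepts `box` without `symmetric_key` sees only scrambled bytes, which is exactly the property we wanted from the locked box.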

But by introducing the box we have introduced two more problems.
a. There are different types of boxes and locks available in the market. We (the bank and my PA) must agree on which kind of box and lock to use, so that we both know how to operate it.

b. We cannot use the same symmetric key permanently for all our transactions. It is unsafe to store it in my home: if anybody gets hold of this key, he will be able to open our box. To avoid this, my PA can generate a new key every time we want to start a transaction. But now the problem is how my PA will send the key to the bank. We can't send it directly through the postal service, as someone might copy it and later misuse it to steal my documents.

To solve this problem we need another kind of lock. This lock has two different keys: it can be locked with key-1 but can only be unlocked with key-2. Key-1 is the public key and key-2 is the private key. The bank owns both keys; it shares the public key with everybody but the private key with nobody. This means anybody who has the public key can lock the box, but only the bank can unlock it, using the private key. Let us call this mechanism the asymmetric box and the asymmetric keys.

My PA wants to send the symmetric key to the bank. He can ask the bank to send him the public key, lock the symmetric key in the asymmetric box using the public key, and send the box through the postal service. Now only the bank can open this box, using the private key, and get hold of the symmetric key. The bank sends the public key in the open, but there is no need to secure it. Even if somebody else gets it, it does not matter, because the public key cannot unlock anything.
To sum up: to secure the documents we use a symmetric key, and to secure the symmetric key we use asymmetric keys (public/private). There is no need to secure the public key.
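
The asymmetric box can be sketched with textbook RSA and deliberately tiny numbers. This is only an illustration under loud assumptions: the primes 61 and 53 are far too small to be secure, real systems add padding and use keys thousands of bits long, and the function names `lock_with_public` and `unlock_with_private` are invented for this example.

```python
import secrets

# Bank's one-time key generation (toy-sized primes, for illustration only).
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

public_key = (e, n)          # the bank hands this to everybody
private_key = (d, n)         # the bank never shares this


def lock_with_public(message: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)


def unlock_with_private(box: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(box, exponent, modulus)


# The PA picks a fresh symmetric key (a small number here so it fits below n).
symmetric_key = secrets.randbelow(n - 2) + 2

box = lock_with_public(symmetric_key, public_key)      # safe to post
assert unlock_with_private(box, private_key) == symmetric_key
```

Notice that the PA only ever needs `public_key`; the value `d` never leaves the bank.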

Let us summarize our steps

1. Bank needs to generate asymmetric keys (One time activity)
2. Bank needs to get certificate from authority (One time activity)
3. Before starting any conversation, my PA needs to ask the bank which model of symmetric box/lock can be used. For that, my PA sends the list of box/lock models he supports to the bank and asks which of them the bank also supports. There is no secret information here, so anybody may read it. No issues.
4. The bank agrees to a certain box/lock model and sends that information back. There is no secret information here, so anybody may read it. No issues.
5. My PA asks for the bank's certificate, which is issued by the authority. Along with the certificate the bank also sends its public key. There is no secret information here, so anybody may read it. No issues.
6. My PA contacts the authority and validates the bank's certificate. If it is found correct and valid, my PA starts the next step. Otherwise he warns me and gives me the option to go ahead or decline.
7. Let us say the certificate is valid, or I allow my PA to go ahead.
8. We (my PA and the bank) start the conversation.
9. My PA generates a random symmetric key.
10. He locks it using the public key and sends it to the bank. We are sending a very sensitive symmetric key, but it is locked using the public key and can only be opened with the private key. No one other than the bank has the private key, so no one else can do anything with it.
11. The bank unlocks the symmetric key using its private key and keeps it for our conversation.
12. My PA locks the documents using the symmetric key and sends them to the bank. No one other than my PA and the bank has the symmetric key, so no one can see the documents on the way.
13. The bank unlocks the documents using the symmetric key.
14. The bank locks its documents using the symmetric key and sends them to my PA. No one other than my PA and the bank has the symmetric key, so no one can see the documents on the way.
15. My PA unlocks the documents using the symmetric key.
16. We continue sending documents using the symmetric key until I decide to end the conversation.
17. I decide to end the conversation.
18. The next time we want to contact the bank, we start again from step 3. We check the bank's validity again and we generate a new symmetric key.

All of this is what happens in HTTPS.

Let me rephrase it now
I                = End user
My PA            = Browser (more technically, the SSL/TLS layer of the connection)
Bank             = Secured web server
Postal services  = Network
Documents        = Secret data (user id/password, bank account number, personal data, etc.)
Authority        = CA (certificate authority, such as VeriSign)
Box              = Encryption
Model of box     = Encryption mechanism
Symmetric key    = Symmetric key for encryption/decryption
Asymmetric keys  = Asymmetric keys for encryption/decryption
Conversation     = Browser session



Now let me convert it to technical words
1. A web server which wants to support secret conversations needs to generate asymmetric keys. (One time activity)
2. A web server which wants to support secret conversations needs to get a certificate from a certificate authority such as VeriSign. (One time activity) [I would like to cover steps 1 and 2 in a separate post, showing how to do it using Java's keytool]
3. Before starting any conversation, the browser needs to ask the web server which encryption mechanism can be used. For that, the browser sends the list of encryption mechanisms it supports (RSA, SSL/TLS version, etc.) to the web server and asks which of them the server also supports. There is no secret information here, so anybody may read it. No issues.
4. The web server agrees to a certain encryption mechanism and sends that information back. There is no secret information here, so anybody may read it. No issues.
5. The browser asks for the web server's certificate, which is issued by the CA. The certificate carries the web server's public key. There is no secret information here, so anybody may read it. No issues.
6. The browser validates the web server's certificate against the CA. In practice the browser already holds the certificates of the CAs it trusts, so it can check the CA's signature and the validity dates itself, and it may additionally ask the CA whether the certificate has been revoked. If the certificate is found correct and valid, the browser starts the next step. Otherwise it warns me and gives me the option to go ahead or decline. We have probably all seen such a warning in our browser, especially while accessing an HTTPS site on an intranet. It is common practice not to involve a CA for internal websites and to use a self-signed certificate instead.
7. Let us say the certificate is valid, or I allow the browser to go ahead.
8. We (the browser and the web server) start the conversation.
9. The browser generates a random symmetric key.
10. It locks the key using the public key and sends it to the web server. We are sending a very sensitive symmetric key, but it is locked using the public key and can only be opened with the private key. No one other than the web server has the private key, so no one else can do anything with it. (This is the RSA style of key exchange. Newer TLS versions agree on the symmetric key with a Diffie-Hellman exchange instead, but the goal is the same: both sides end up with a key nobody else knows.)
11. The web server unlocks the symmetric key using its private key and keeps it for our conversation.
12. The browser locks the secret data using the symmetric key and sends it to the web server. No one other than the browser and the web server has the symmetric key, so no one else can see the data on the way.
13. The web server unlocks the secret data using the symmetric key.
14. The web server locks its secret data using the symmetric key and sends it to the browser. No one other than the browser and the web server has the symmetric key, so no one else can see the data on the way.
15. The browser unlocks the secret data using the symmetric key.
16. We continue sending secret data using the symmetric key until the browser decides to end the conversation.
17. I decide to end the conversation by closing the HTTPS browser session.
18. The next time the browser wants to contact the web server, it starts again from step 3. It checks the web server's validity again and generates a new symmetric key.
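
Steps 3 and 4 boil down to picking one mechanism that both sides support. Here is a minimal sketch, assuming the server walks its own preference list and takes the first entry the browser also offered (real servers may apply different policies). The function `choose_cipher` is hypothetical, and the suite names are only examples of what such a list can contain.

```python
from typing import Optional, Sequence


def choose_cipher(client_offers: Sequence[str],
                  server_prefers: Sequence[str]) -> Optional[str]:
    """Return the server's most preferred suite that the client also offers."""
    offered = set(client_offers)
    for suite in server_prefers:
        if suite in offered:
            return suite
    return None  # nothing in common: the conversation cannot start


# Step 3: the browser says what it supports (Client Hello).
client_offers = ["TLS_AES_128_GCM_SHA256", "TLS_CHACHA20_POLY1305_SHA256"]

# Step 4: the server answers with its choice (Server Hello).
server_prefers = ["TLS_AES_256_GCM_SHA384", "TLS_AES_128_GCM_SHA256"]
agreed = choose_cipher(client_offers, server_prefers)
print(agreed)
```

If the two lists have nothing in common, the function returns `None`, which corresponds to the handshake failing before any secret data is sent.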

Steps 3-4 are the Hello messages that open the handshake (Step 3 -- Client Hello, Step 4 -- Server Hello)
Steps 5-6-7 are the certificate exchange
Steps 9-10-11 are the key exchange, which completes the handshake
Steps 12-13-14-15-16-17 are the data exchange over HTTPS
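
If you want to watch these steps happen, the sketch below uses Python's standard `ssl` and `socket` modules to play the browser's role. The helper name `tls_summary` is hypothetical, and calling it needs network access to whatever host you pass in.

```python
import socket
import ssl


def tls_summary(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open an HTTPS-style connection and report what was agreed."""
    # Trusted CA certificates are loaded here; they are used in step 6.
    context = ssl.create_default_context()

    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake: hello messages, certificate
        # check and key exchange. It raises ssl.SSLError (for example
        # ssl.SSLCertVerificationError) if the certificate is not acceptable.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),         # agreed protocol version
                "cipher": tls.cipher()[0],         # agreed encryption mechanism
                "valid_until": cert.get("notAfter"),
            }
```

Calling `tls_summary("example.com")` should return a small dictionary describing the session. From that point on, anything written to the wrapped socket is encrypted with the session's symmetric key.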


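A few questions before I close this post
1. Why are we not using asymmetric keys to exchange the data itself?
Ans: Asymmetric encryption is much more CPU-intensive than symmetric encryption, so we want to avoid it for bulk data. That is why HTTPS uses the asymmetric keys only to exchange the symmetric key, and the symmetric key for everything else.

2. Why can't we use the same symmetric key every time?
Ans: It is not safe to store a symmetric key anywhere on your laptop. If anybody got hold of it, they could read all of your conversations, so a new key is generated for every session.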



In my next post I will show how we can generate asymmetric keys and get them signed by a CA.
