Skip to main content

Cryptography


Cryptography is the fundamental technology underlying lots of hot topics nowadays, such as security of IoT system, or blockchain and its primary application - cryptocurrency (Bitcoin is one of them). Hence, a basic understanding of what problems are the classic cryptography are trying to solve is vital to get a genuine of appreciation of their today's trending applications.
I tried to fix that on myself, by taking a Cryptography course on Couresa. It is an excellent course, taught by Professor Dan Boneh, from Standford University. And it is free! It has everything you need on Cryptography, probably a little bit too much for most of us. I strongly encourage you to take a look at the course if you really want to understand what security means technically.
Below are the notes I took during the course. I'm not sure how useful it can be for others since by simpling looking at the notes the chance to understand the Cryptography is slim - you'll have to really have to think it through very hard.

Overview

Security Goal

  • Confidentiality : data is kept secret
  • Integrity : data isn't modified
  • Message Authentication : the sender of the message is authentic and the message itself isn't modified
  • nonrepudiation : sender of the message can't deny the creation of the message
For example: Alice send message to Bob
  • Confidentiality means only Alice and Bob understand the message
  • Integrity mean Bob can be sure the message haven't been modified
  • Authentication Bob is sure the message is send from Alice and message isn't modified
  • Nonrepudiation Alice can't deny it's she create the message
  • Confidentiality use Symmetric Cipher or less common Symmetric cipher
  • Integrity use MAC
  • Authentication use digital signature and MAC
  • Nonrepudiation use digital signature

Symmetric Cipher

  • Provide Confidentiality
  • Def enc(plaintext, key) -> cyphered text; dec(cypered_text, key) -> plain text
  • Stream Cipher
    • RC-4
  • Block Cypher
    • DES,3-DES
    • AES

MAC

  • Provide Integrity and Authentication
  • Base on either Symmetric Cipher or Hash

Asymmetric Cipher, (G,E,D)

  • Provide Confidentiality (but mainly for key exchange, not for long message) and Nonrepudiation
  • Def G: generate key pair, (sk, pk) E: enc(plaintext, public key) -> cyphered text; D: dec(cyphered text, private key) -> plain text
  • RSA <- factorization="" integer="" li="">
  • ElGamal <- discrete="" li="" logarithms="">
  • Ellie Curve <- curve="" ellie="" li="">

Digital Signature

  • Provide Nonrepudiation
  • Base on Asymmetric Cipher Or DSA
  • Theory: sender sign the message with it's private key, so it can't deny the creation of the message because only she has the private key.
  • Notes: private/security key is used to sign (or encode), public key is used to verify. Compared with encryption, public key is used encrypt, private key is used to decrypt.
sign with private key, decrypt with public key.

Key Establishment

  • Certificate
  • PKI

Stream Ciphers

  • Cipher (K,M,C)
    • E(k,m) = k XOR m
    • D(k,c) = k XOR c
  • One Time Pad (OPT) : a stream cypher for which
    1. key stream k1,k2,k3.. is random generated
    2. key stream bit ki is used only once
    3. key stream is known to legitimate parties only
    OPT is perfect secure, but requires the key stream generator to be true random and len(k) >= len(m), so it is hard to use in practice.
  • Pseudorandom Generators (PRG) Making OPT practical, by replace true random with pseudorandom. E(k,m) = m XOR G(k) D(k,c) = c XOR G(k) where G : {0,1} s size -> {0,1} n size
  • The Stream Cipher still alive - RC4 link

Block Cyphers

  • Confusion and Diffusion
    • Confusion: obscure key and value
    • Diffusion: one plaintext spread over all the cipher text -> 1 bit change in plain text result in big change in cipher text
  • Operate on a fixed block size, e.g 64bits, 128bits(AES)
  • Modes
    • Block Modes : making same block of plaintext output different cyphertext
      • ECB (Electronic Code Book)
      • CBC (Cipher Blocking Chain)
      • CFB (Cipher Feedback)
      • OFB (Output Feedback)
      • CTR (Counter)
      • OCB (Offset Codebook)
      • GCM
    • Padding Modes
      • PKCS7
  • DES, considered not secure nowadays.
    • PRF, PRP
    • Feistel network
    • 3-round Feistel network
  • AES, DES replacement
    • To counter the plain text attack
      • CBC model, using random Initialization Vector,
      • Random ctr mode
  • Example: DES, JCA(Java Cryptography Architecture) API
// sender: 
// 1. generate the key
secretKey = KeyGenerator.getInstance("DES").generateKey();
// 2. encrypt with mode
Cipher encryptor = Cipher.getInstance("DES/CBC/PKCS5Padding");
encryptor.init(Cipher.ENCRYPT_MODE, secretKey);
// 3. share the key
keyString = Base64.encodeToString(secretKey.getEncoded());
// send the Base64 encoded shared key, the receiver will be able 
// to reconstruct the shared key and use it to decrypt

// receiver:
// 1. get key, encrypted data and other encrypt specific data
// 2. decrypt

Message Integrity

  • code or data wasn't be modified (secure boot or disk encryption)

Definition

  • Definition : MAC I = (S,V) over (Key,Message,Tag)
    • S(k, m) -> t, [m || t] is sent
    • V(k, [m, t]) -> OK?
  • Secure MAC Attacker can't produce valid tag for new message.
  • Important concept and properties
    1. Cryptographic checksum A MAC generates a cryptographically secure authentication tag for a given message.
    2. Symmetric MACs are based on secret symmetric keys. The signing and verifying parties must share a secret key.
    3. Arbitrary message size MACs accept messages of arbitrary length.
    4. Fixed output length MACs generate fixed-size authentication tags.
    5. Message integrity MACs provide message integrity: Any manipulations of a message during transit will be detected by the receiver.
    6. Message authentication The receiving party is assured of the origin of the message.
    7. No nonrepudiation Since MACs are based on symmetric principles, they do not provide nonrepudiation.
  • Secure PRF (Pseudorandom Function) -> Secure MAC
    • AES is a secure PRF so it is secure MAC but works only on 16 bytes
    • Given PRF for short message construct PRF for long message, or convert Small MAC to Big MAC, there are two way doing this
      • CBC-MAC - build it from block cypher
      • HMAC - build it from Hash function

CBC-MAC, NMAC, PMAC - all are PRF

Hash Collision Resistance

  • Let H: M -> T to be a hash function a collision is H(m0) = H(m1) and m0 != m1
  • Requirements/features:
    • preimage resistance, or One-wayness Can't get the message from the hash
    • second image resistance, or Collision Resistance: For a hash function H, the probability that attacker can find a m1 and m2 so that H(m1) = H(m2) is negligible.
  • MAC from collision Resistance Let I = (S,V) be MAC for short message over (K, M, T), e.g AES Let H : Mbig -> M, called Digest/hash algorithm, e.g md5/sha1/sha256/etc.. Define Ibig = (Sbig, Vbig) over (K, Mbig, T) as Sbig(k, m) = S(k, H(m)), Vbig(k, m, t) = V(k, H(m), t)
    If I is secure MAC and H is collision resistant, then Ibig is secure.
    Example: let H = SHA256, S = AES ; then Sbig(k,m) = AES(k, SHA256(m))
    Note: this is not HMAC!!
  • Properties
    • block size : Usually the hash function is blocked based, the hash is base on block, and the output of one block will feed into another stage as the input. see Merkle–Damg√•rd construction. Remember, 512.
    • max message size : Remember, 2^64-1
    • digest size : the output of hash is fixed for certain algorithm (e.g 256 for SHA256).

Merlke-Damgard Paradigm

  • Given C.R function for short message, build C.R function for long message. given compression function h : T * X -> T get H: XBig->T Chain variables:
  H0 = h(IV, m0) 
  Hi = h(Hi-1, mi)
  • Build compression function from block cypher. Davies-Meyer compression function h(H,m) = E(m, H) XOR H Use the message as the key, hash value (from previous output) as the value to be encrypted. The output is the same size as the input hash value.

SHA256

  • A Collision Resistance Hash building using
    • Merlke-Damgard Paradigm
    • Davies-Meyer compression function
    • Block cipher: SHACAL-2
    block size: 512 digest size: 256
  • Very useful!
    • Used by HMACSHA256 to build a Hash based MAC
    • Used to RSA
    • Used by EC

HMAC - a way of building secure MAC from Hash

  • The obvious but not secure way of building the MAC using hash H(k||x) or H(x||k)
  • HMAC - a secure way of building MAC from Hash H (k XOR opad || H( k XOR ipad) || m)
    k is expanded from the symmetric key, with 0 on the left, until the block size is reached, i.e 0000...key_s ipad is a repetition of 0011 0110, until the block size is reached opad is a repetition of 0101 1100, until the block size is reached
    e.g from HmacSHA256 the block size is 512.
    • HMAC is not acronym referring a generic way building hash from MAC but the specific way as described above. The difference is in what Hash function is used, e.g when the hash is SHA256 it is called HmacSHA256. similar, we have HmacSHA224,...
    • key size is defined the Hash function
    • tag size is defined by the digest size of the Hash function.
    • more parameters maybe required depend on the Hash function, e.g SHA256 requires an random IV, this IV will be needed in the verification side.
  • HmacSHA256
  • Despite Merlke-Damgard hash collision resistance and can work on long message but it is not safe to be used as a MAC directly. We need a PRF (e.g) and that is called Hash-MAC or HAMC.

Authenticated Encryption

  • MAC and then Encrypt (SSL)
  • Encrypt and then MAC (IPsec)
    • see android keystore client implementation
  • Encrypt and MAC (SSH) - Not safe
  • Example: TLS

TLS

  • Use mac and then encrypt
  • unidirectional key : both client and server has two keys. use symmetric key!
    • browser_to_sever_key_mac
    • browser_to_sever_key_enc
    • server_to_browser_key_mac
    • server_to_browser_key_enc
  • CBC-AES128, HMAC-SHA1
  • browser -> server
    • record == [ header || data || tag || pad ]
    • tag <- browser_to_sever_key_mac="" ctr_b_s="" data="" header="" li="" sign="">
    • pad [header||data||tag] to AES block
    • encrypt: CBC encrypt with browser_to_sever_key_enc and new random IV
    • prepend header
  • server -> browser
    • decrypt with browser_to_sever_key_enc
    • check pad
    • check tag on [ctr_b_s || header || data]

Key derivation

  • generate many keys from the the source key
  • if Source Key is uniformed distributed, use KDF
  • if not, extract and then expand
    • extract: making it indistinguishable from random distribution, using salt
    • expand : use KDF
  • HKDF : HMAC based KDF
    • extract: k <- hmac="" li="" salt="" source_key="">
  • PBKDF : Password Based KDF
    • Don't use HKDF to generate key from password, since password has low entropy
    • standard: PKCS#5(PBKDF1)
      • iterate c times using H(salt|pwd), slow down each time

Problem with Symmetric Encryption

  • Shared key exchange
  • Number of keys : A key is need between any 2 of N
  • No protection against cheating by Alice or Bob since that have same key

Usage/Purpose of Public-Key Algorithms:

It is important to realize that Public-Key Algorithms is not just for Encryption. Actually, since Public-Key Encryption is very slow, it is rarely used directly to encrypt large data, instead, it is often used to encrypt/exchange the symmetric key that will be used to encrypt the actual message.
  • Key Establishment e.g Diffie–Hellman key exchange (DHKE) or RSA key transport protocols.
  • Encryption Encrypt messages using as RSA or Elgamal.
  • Nonrepudiation Providing nonrepudiation and message integrity can be realized with digital signature algorithms, e.g., RSA, DSA or ECDSA.
  • Identification We can identify entities using challenge-and-response pro- tocols together with digital signatures.

3 Type of Public-Key Algorithm Families

  • Integer-Factorization Schemes RSA
  • Discrete Logarithm Schemes Diffie–Hellman key exchange, Elgamal encryption or the Digital Signature Algorithm (DSA).
  • Elliptic Curve (EC) Schemes Elliptic Curve Diffie–Hellman key exchange (ECDH) and the Elliptic Curve Digital Signature Algorithm (ECDSA). ECDSA used by bitcoin

Basic Key Exchange

Solutions

  • online Trust 3rd Party
  • Deffie-Hellman Protocol
  • Public-Key Encryption The key exchange problem triggered the start of public cryptography

online Trust 3rd Party

  • Toy protocol: When alice when to share a key with Bob, she ask TTP for a shared key. TTP will send alice [ enc(shared_key, alice_privateKey), enc(shared_key, bob_privateKey)]. The late message will be send to Bob by Alice. So, Alice and Bob set up a shared key
  • insecure to replay attack

Diffie-Hellman Key Exchange

Public-key encryption

Number Theory

  • gcd : gcd(x,y) = ax + by;
  • Modular invertible x is invertible in Zp, if there is a y in Zp making z*y = 1 in Zp Lemma: x is invertible if gcd(x,P) = 1
    (Zp)* represents all invertible in Zp, e.g (Z12)* = [1,3,5,7,11]
  • Fermat's theorem Let p be a prime , V x <- 1="" an="" be="" can="" if="" in="" is="" number="" p-1="" p="" prime.="" test="" to="" used="" x="" zp="">
  • Dlog
    • Elliptic curve group

Public-Key Encryption

  • Asymmetric, public key (G,E,D) G: generate key pair, (sk, pk) E: enc(plaintext, public key) -> cyphered text; D: dec(cyphered text, private key) -> plain text
  • Using TDF
  • Using Diffie-Hellman: ElGamal

Trapdoor Function (TDF), or one-way function

  • TDF: a triple algorithms that X->Y, (G,F,F')
    • G: random algorithm generate key pair(sk, pk)
    • F: F(pk, X) -> Y
    • F': F'(sk, Y) -> X
  • Secure TDF, F is one way function, that is to say without sk, the probability of invert X correctly is negligible.

Public-Key Encryption using Secure TDF

  • we need three components, Iso standard
    • (G, F, F') : A secure TDF
    • (E, D) : A symmetric authenticated encryption defined over (K,M, C)
    • H: X -> k : A Hash
  • For the public encryption, we need (G,E,D)
    • G: Use the same G in TDF
    • E(pk, m) -> c
    • D(sk, c) -> m
  • E(pk, m)
    • random pick x from X
    • y = F(pk,x)
    • k = H(x) , c = E(k, m)
    • output [y, c]
 | F(pk,x)    |        E(H(x), m)                      |
     header              body
  • D(sk, [y,c])
    • x = F'(sk, y) , k = H(x), m = D(k, c)
  • Don't use TDF directly to the message! It is not secure. Instead, the TDF is used to encode/trapdoor the source of symmetric key that will be used to encrypt the message.

RSA

RSA in Practice, PKCS

  • used in practice, different from what we described above
  • key difference:
  key   ->   preprocess   ->   RSA
  128bits     2048bits
  • RSA : Java API
// sender
// 1. generate private and public key
rasKeyG = KeyPairGenerator.getInstance("RSA");
privateKey = rasKeyG.getPrivate(); // secure you private key
publicKey =  rasKeyG.getPublic();

// 2.share the public key, dont share the public key directly but share
// some parameters( e.g the modulus and exponent for RSA), so that the
// receiver can to reconstruct the public key.
RSAPublicKeySpec rsaPublicKeySpec =
             MM,CX,           (RSAPublicKeySpec)keyFactory.getKeySpec(
              publicKey, 
              Class.forName("java.security.spec.RSAPublicKeySpec"));
modulus = rsaPublicKeySpec.getModulus();
exponent =rsaPublicKeySpec.getPublicExponent();

// receiver
// 1. reconstruct the public key
// 2. generate the data and use the public key to encrypt the data
// 3. send the encrypted data and the other side will decrypt it using 
//    its private key

Public-Key Encryption using Diffie-Hallman

Public-Key Encryption Application

  • interactive
    • Used to exchange a symmetric key
  • non-interactive
    • Encrypted file system, and share the access enc. the message use enc. the sym. key with symmetric encryption public-key encryption | E(pk2, k) | <-- a="" b="" e="" enc.="" es="" k="" li="" m="" pk1="" pk="" s="" with="">

Popular posts from this blog

Android Camera2 API Explained

Compared with the old camera API, the Camera2 API introduced in the L is a lot more complex: more than ten classes are involved, calls (almost always) are asynchronized, plus lots of capture controls and meta data that you feel confused about.

No worries. Let me help you out. Whenever facing a complex system need a little bit effort to understand, I usually turns to the UML class diagram to capture the big picture.

So, here is the class diagram for Camera2 API.




You are encouraged to read this Android document first and then come back to this article, with your questions. I'll expand what is said there, and list the typical steps of using camera2 API. 

1. Start from CameraManager. We use it to iterate all the cameras that are available in the system, each with a designated cameraId. Using the cameraId, we can get the properties of the specified camera device. Those properties are represented by class CameraCharacteristics. Things like "is it front or back camera", "outpu…

Java Collections Framework Cheat Sheet

Java Collections Framework (JCF) implements the Abstract Data Type  for Java platform. Every serious Java programmer should familiar himself on this topic and be able to choose the right class for specific need.  A thorough introduction to JCF is not the target of this small article and to achieve that goal you can start with this excellent tutorial . 

Instead, I'd like to
1) Provide an overview of JCF's classes ,   2) Provide a cheat sheet you can post in your cubicel for daily reference, 3) Underline the relationship between JCF's implementation and the data structure and algorithm you learned in your undergraduate course

With these goals in mind, I came up following diagram - Java Collection Cheat Sheet. You can click it to zoom in. There is no necessity for more explanation once your familiar with UML class diagram and have a basic understanding of common data structures.


Android Security: An Overview Of Application Sandbox

The Problem: Define a policy to control how various clients can access different resources. A solution: Each resource has an owner and belongs to a group.Each client has an owner but can belongs to multiple groups.Each resource has a mode stating the access permissions allowed for its owner, group members and others, respectively. In the context of operating system, or Linux specifically, the resources can be files, sockets, etc; the clients are actually processes; and we have three access permissions:read, write and execute.