Security best practices: Data integrity and authenticity
Certified variables
Security concern
ICP offers three modes of operations for canisters, update
, query
and composite_query
. For the sake of simplicity, we will club composite_query
under queries for the rest of this section. For more information, view the detailed overview between update
and query
calls. As explained in the overview, update calls are slow and expensive but provide integrity guarantees as their response include a threshold signature signed by the subnet.
On the other hand, query calls are fast since a single replica formulates the response but there is no integrity guarantee, since the response can be manipulated by a single replica or boundary node. For example, if the NNS dapp fetches proposal information from the governance canister via query calls and the responding node is malicious, it can mask an ill intentioned proposal that causes irrevocable damage as innocuous by modifying the proposal payload in the response and mislead voters into voting yes. Another consequence of query calls is that users can't rely on canister_inspect_message as a guard. This makes query calls, in it's raw form, unfit to serve data for security critical applications.
Using certified variables for secure queries
In certain use cases, there is a third option whereby query results can return data that has been certified by the subnet in an earlier update call. This is the concept of certified data and it requires changes to the update call to create the certification, query call to return the certificate and front-end to verify the certificate. Using certified data provides the best of both worlds with query-like response times and update-like certified responses. This forms the core of certified variables.
Some examples of certified variables are asset certification in Internet Identity, NNS dapp, or the canister signature implementation in Internet Identity.
Certified variables is an advanced feature which require careful implementation of authenticated data structures and verification on the canister and client side respectively. If the client doesn't require fast response times, call the query method as an update call (replicated query). The response would be certified by the subnet and a single malicious or boundary node can't modify the response.
ICP also provides replica signed queries, where query responses are signed by the answering replica node, however it doesn't have the same security guarantees an update
call and only protects from malicious boundary nodes. Replica signed queries are enabled by default on both agent-rs
and agent-js
. Read more about the feature.
What is certified data?
Aside update calls, the subnet certifies (creates a threshold signature) a part of the canister data every round. This is stored in the state tree under the label certified_data
. However, since it's certified every round, the amount of data that can be stored in certified_data
is limited to 32 bytes. Hence, when you modify the state of your canister during an update call, if you can convert the state into a unique representation that can fit into 32 bytes, you can store it under certified_data
and it will be certified. Naturally, this can be done by computing a hash of the data structure of the canister state. This is also why certified variables is difficult to implement. Depending on your data structures, you will need to develop a different kind of hashing function.
Subsequent query calls can return the data as-is, including the signature on the certified_data
, which the front-end can verify with the IC root public key. This means that data aggregation or other calculations can't be done in query calls as there would be no way to produce a signature over that newly created data. There are two workarounds, either this data is precomputed in the update call or all raw data is sent to the front-end which verifies it and does the calculations. Combining these features, a canister should be able to certify a variable in a query response with this design.
On a high level, in your canister,
- Choose an authenticated data structure like Merkle trees to store a value in canister memory.
- In the update call,
- Perform the computation and store the result in the Merkle tree.
- The lookup path for the result must act as it's
key
. Ideally thiskey
should be the parameters provided by the caller in the query method. - Recompute the Merkle proof (
root_hash
) - Store the
root_hash
as the canister's certified data. - Return the
key
as response.
- In the query call,
- Fetch the result from the Merkle structure using the query parameters as the lookup path.
- Fetch the current
certified_data
for the canister. - Compute the witness for the result using the same lookup path. The Merkle witness provides proof of inclusion that the requested result exists in the Merkle tree under the given path.
- Return
(result, certified_data, witness)
as the response.
The rest of the section shows an example canister, which can serve a certified response for a query
using certified_data
which is verified in the frond-end. The examples are written in Rust and Motoko but the overall design can be implemented in other languages.
Building a canister with certified variables
Let's consider the following canister interface:
type User = record {
name: text;
age: nat8;
};
type CertifiedUser = record {
user : User;
certificate : blob;
witness : blob;
};
service : {
"set_user": (User) -> (nat64);
"get_user": (nat64) -> (CertifiedUser) query;
}
The canister exposes the following service:
- set_user: The caller provides a
User
object to the canister. The canister records it and serves a correspondingindex
for the entry as the response. Sincecertified_data
can only store 32 bytes of data, it uses a specialized data structure fromic_certified_map
to store theUser
data.- The data structure internally stores the data in a
HashTree
(or Merkle tree) and records theroot_hash
of the data structure in thecertified_data
, which is 32 bytes. - The
root_hash
cryptographically guarantees that only one tree can correspond to that hash. Theroot_hash
is also referred as the Merkle proof.
- The data structure internally stores the data in a
- get_user: The caller provides a
index: nat64
to the canister and gets a certified response for the correspondingUser
. TheCertifiedUser
response must have the following structure for verifying the response:- user: The actual response.
- certificate: The payload for verifying the signature on the
certified_data
. ICP provides the system APIdata_certificate()
for this. - witness: Allows for the final verification of the response to be completed with the requested input and
certified_data
.
You can find an example implementation of the canister below.
- Rust
- Motoko
use candid::CandidType;
use ic_certified_map::HashTree;
use ic_certified_map::{leaf_hash, AsHashTree, Hash, RbTree};
use serde::{Deserialize, Serialize};
use std::borrow::Cow;
use std::cell::Cell;
use std::cell::RefCell;
#[derive(CandidType, Serialize, Deserialize, Clone)]
struct User {
name: String,
age: u8,
}
impl AsHashTree for User {
fn root_hash(&self) -> Hash {
let user_serialized = serde_cbor::to_vec(&self).unwrap();
leaf_hash(&user_serialized[..])
}
fn as_hash_tree(&self) -> HashTree<'_> {
HashTree::Leaf(Cow::from(serde_cbor::to_vec(&self).unwrap()))
}
}
#[derive(CandidType)]
struct CertifiedUser {
user: User,
certificate: Vec<u8>,
witness: Vec<u8>,
}
thread_local! {
static INDEX : Cell<u64> = Cell::new(0);
static TREE: RefCell<RbTree<&'static str, RbTree<[u8; 8], User>>> = RefCell::new(RbTree::new());
}
#[ic_cdk::update]
fn set_user(user: User) -> u64 {
let index = INDEX.with(|index| {
let count = index.get() + 1;
index.set(count);
count
});
TREE.with_borrow_mut(|tree| {
match tree.get(b"user") {
Some(_) => {
tree.modify(b"user", |inner| {
inner.insert(index.to_be_bytes(), user);
});
}
None => {
let mut inner = RbTree::new();
inner.insert(index.to_be_bytes(), user);
tree.insert("user", inner);
}
}
ic_cdk::api::set_certified_data(&tree.root_hash());
});
index
}
#[ic_cdk::query]
fn get_user(index: u64) -> CertifiedUser {
let certificate = ic_cdk::api::data_certificate().expect("No data certificate available");
TREE.with_borrow(|tree| {
let user = match tree.get(b"user") {
Some(inner) => {
let user = inner.get(&index.to_be_bytes()[..]).expect("User not found");
user.to_owned()
}
None => {
panic!("Tree isn't initialized");
}
};
let mut witness = vec![];
let mut witness_serializer = serde_cbor::Serializer::new(&mut witness);
let _ = witness_serializer.self_describe();
tree.nested_witness(b"user", |inner| inner.witness(&index.to_be_bytes()[..]))
.serialize(&mut witness_serializer)
.unwrap();
CertifiedUser {
user,
certificate,
witness,
}
})
}
import CertifiedData "mo:base/CertifiedData";
import Blob "mo:base/Blob";
import Nat8 "mo:base/Nat8";
import Debug "mo:base/Debug";
import Text "mo:base/Text";
import Nat64 "mo:base/Nat64";
import Array "mo:base/Array";
import CertTree "mo:ic-certification/CertTree";
import CV "mo:cbor/Value";
import CborEncoder "mo:cbor/Encoder";
import CborDecoder "mo:cbor/Decoder";
actor CertifiedVariable {
type User = {
name : Text;
age : Nat8;
};
type CertifiedUser = {
user : User;
certificate : Blob;
witness : Blob;
};
stable var count : Nat64 = 0;
stable let cert_store : CertTree.Store = CertTree.newStore();
let ct = CertTree.Ops(cert_store);
public func set_user(user : User) : async Nat64 {
count += 1;
let path : [Blob] = [Text.encodeUtf8("user"), blobOfNat64(count)];
ct.put(path, encodeUser(user));
ct.setCertifiedData();
return count;
};
public query func get_user(index : Nat64) : async CertifiedUser {
let certificate = switch (CertifiedData.getCertificate()) {
case (?certificate) {
certificate;
};
case (null) {
Debug.trap("Certified data not set");
};
};
let path : [Blob] = [Text.encodeUtf8("user"), blobOfNat64(index)];
let value = switch (ct.lookup(path)) {
case (?value) {
value;
};
case (null) {
Debug.trap("Lookup failed");
};
};
let user : User = decodeUser(value);
let witness = ct.encodeWitness(ct.reveal(path));
let certifiedUser : CertifiedUser = {
certificate = certificate;
witness = witness;
user = user;
};
return certifiedUser;
};
func encodeUser(user : User) : Blob {
let bytes : CV.Value = #majorType5([
(#majorType3("name"), #majorType3(user.name)),
(#majorType3("age"), #majorType0(Nat64.fromNat(Nat8.toNat(user.age)))),
]);
let #ok(encoded_user) = CborEncoder.encode(bytes);
return Blob.fromArray(encoded_user);
};
func decodeUser(bytes : Blob) : User {
let #ok(#majorType5(map)) = CborDecoder.decode(bytes);
let name_tag = Array.find<(CV.Value, CV.Value)>(map, func x = x.0 == #majorType3("name"));
let age_tag = Array.find<(CV.Value, CV.Value)>(map, func x = x.0 == #majorType3("age"));
let name = switch (name_tag) {
case (?name_value) {
let #majorType3(name) = name_value.1;
name;
};
case (null) {
Debug.trap("Decoding failed for name");
};
};
let age = switch (age_tag) {
case (?age_value) {
let #majorType0(age) = age_value.1;
Nat8.fromNat(Nat64.toNat(age));
};
case (null) {
Debug.trap("Decoding failed for age");
};
};
return {
name = name;
age = age;
};
};
func blobOfNat64(n : Nat64) : Blob {
let byteMask : Nat64 = 0xff;
func byte(x : Nat64) : Nat8 {
Nat8.fromNat(Nat64.toNat(x));
};
Blob.fromArray([
byte(((byteMask << 56) & n) >> 56),
byte(((byteMask << 48) & n) >> 48),
byte(((byteMask << 40) & n) >> 40),
byte(((byteMask << 32) & n) >> 32),
byte(((byteMask << 24) & n) >> 24),
byte(((byteMask << 16) & n) >> 16),
byte(((byteMask << 8) & n) >> 8),
byte(((byteMask << 0) & n) >> 0),
]);
};
};
Verifying certified variables
Once you have the response CertifiedUser
, for the integrity guarantee, the front-end must verify the certification in the response. This is broken down into several steps implemented in the Rust and JavaScript example below.
The example has some extra steps to setup the canister with some User
data before verification. You can ignore the section marked between // ==== START of canister data setup
and // ==== END of canister data setup
- Verify the IC certificate: Recompute the
root_hash
ofcertificate.tree
(pruned state tree with the canister'scertified_data
) and verify thecertificate.signature
withroot_hash
as the message,certificate.delegation
, and the ICroot_key
as the public key. This confirms that the signature is valid for the current state tree. - Validate that the response is not stale by verifying the time at
/time
incertificate.tree
is less than a certain delta of current time. The recommended delta is 5 minutes but should be adapted to the use case. - Recompute the
root_hash
of the witness and verify equality with thecertified_data
. Thecertified_data
can be obtained fromcertificate.tree
under the path/canister/<canister_id>/certified_data
. - Check if query parameters are in the witness. In this example, the lookup path is
/user/<index>
and should be present in the witness. - Validate if the value found in
/user/<index>
matchesuser
from the response. - If all of the previous steps succeed, return
user
as the valid response.
- Rust
- JavaScript
use arbitrary::{Arbitrary, Unstructured};
use candid::Encode;
use candid::Principal;
use candid::{CandidType, Decode, Deserialize};
use futures::future::join_all;
use ic_agent::identity::AnonymousIdentity;
use ic_agent::Agent;
use ic_certificate_verification::validate_certificate_time;
use ic_certificate_verification::VerifyCertificate;
use ic_certification::hash_tree::HashTree;
use ic_certification::{Certificate, LookupResult};
use rand::prelude::*;
use serde_cbor::Deserializer;
use std::time::{SystemTime, UNIX_EPOCH};
#[derive(CandidType, Deserialize, Debug, PartialEq, Eq, Arbitrary)]
struct User {
name: String,
age: u8,
}
#[derive(CandidType, Deserialize)]
struct CertifiedUser {
user: User,
certificate: Vec<u8>,
witness: Vec<u8>,
}
static URL: &str = "http://localhost:41749";
static CANISTER: &str = "a3shf-5eaaa-aaaaa-qaafa-cai";
const MAX_CERT_TIME_OFFSET_NS: u128 = 300_000_000_000; // 5 min
const MAX_CALLS: usize = 10;
#[tokio::main]
async fn main() {
let agent = Agent::builder()
.with_url(URL)
.with_identity(AnonymousIdentity)
.build()
.expect("Unable to create agent");
// This should be done only in demo environments.
// When interacting with mainnet, hardcode the root_key.
agent
.fetch_root_key()
.await
.expect("Unable to fetch root key");
let root_key = agent.read_root_key();
let canister_id = Principal::from_text(CANISTER).unwrap();
// ==== START of canister data setup
let mut rng = rand::thread_rng();
// Make MAX_CALLS to set_user
let mut get_user_calls = Vec::new();
for _ in 0..MAX_CALLS {
let bytes: [u8; 16] = rng.gen();
let mut u = Unstructured::new(&bytes[..]);
let temp_user = User::arbitrary(&mut u).unwrap();
println!("Calling set_user with {:?}", temp_user);
let response = agent
.update(&canister_id, "set_user")
.with_effective_canister_id(canister_id)
.with_arg(Encode!(&temp_user).unwrap())
.call_and_wait();
get_user_calls.push(response);
}
let results: Vec<u64> = join_all(get_user_calls)
.await
.into_iter()
.map(|result| {
Decode!(
result
.expect("Query call get_user failed")
.as_slice(),
u64
)
.unwrap()
})
.collect();
// From response indexes, choose a random index for get_user
let index: usize = rng.gen();
let index: u64 = *results.get(index % MAX_CALLS).unwrap();
// ==== END of canister data setup
println!("Fetching index {:?}", index);
let query_response = agent
.query(&canister_id, "get_user")
.with_effective_canister_id(canister_id)
.with_arg(Encode!(&index).unwrap())
.call()
.await
.expect("Unable to call query call get_user");
let certified_user = Decode!(&query_response, CertifiedUser).unwrap();
let mut deserializer = Deserializer::from_slice(&certified_user.certificate);
let certificate: Certificate = serde::de::Deserialize::deserialize(&mut deserializer).unwrap();
let start = SystemTime::now();
let current_time = start
.duration_since(UNIX_EPOCH)
.expect("Time went backwards")
.as_nanos();
// Step 1: Check if signature in the certificate can be validated with the
// root_hash of the tree in certificate as message and root_key as public_key
let verification_result = certificate.verify(canister_id.as_slice(), &root_key[..]);
println!(
"Step 1: Digest match & Signature verification: {:?}",
verification_result
);
// Step 2: Check if the response is not stale with the given time offset MAX_CERT_TIME_OFFSET_NS.
let time_verification_result =
validate_certificate_time(&certificate, ¤t_time, &MAX_CERT_TIME_OFFSET_NS);
println!("Step 2: Time skew: {:?}", time_verification_result);
// Step 3: Check if witness root_hash matches the certified_data
let lookup_result =
certificate
.tree
.lookup_path([b"canister", canister_id.as_slice(), b"certified_data"]);
let certified_data: [u8; 32] = match lookup_result {
LookupResult::Found(result) => result.try_into().unwrap(),
_ => panic!("Certified data not found"),
};
let mut deserializer = Deserializer::from_slice(&certified_user.witness);
let witness_decoded: HashTree<Vec<u8>> =
serde::de::Deserialize::deserialize(&mut deserializer).unwrap();
let witness_digest = witness_decoded.digest();
println!(
"Step 3: Witness digest matches certified data: {:?} ",
witness_digest == certified_data
);
// Step 4: Check if the query parameters are in the witness
let witness_lookup: User =
match witness_decoded.lookup_path([b"user", &index.to_be_bytes()[..]]) {
LookupResult::Found(result) => serde_cbor::from_slice(result).unwrap(),
_ => panic!("user {} not found", index),
};
// Step 5: Check if the data found in Witness matches the returned result from the query.
println!(
"Step 4 & Step 5: Witness data matches User value: {:?}",
witness_lookup == certified_user.user
);
// Step 6: Return the result
println!("Result: {:?}", certified_user.user);
}
import pkg from "@dfinity/agent";
const { Actor, HttpAgent, Certificate, blsVerify, Cbor, reconstruct, lookup_path } = pkg;
import { IDL } from "@dfinity/candid";
import { Principal } from "@dfinity/principal";
import fetch from "isomorphic-fetch";
import assert from "node:assert/strict";
const idlFactory = ({ IDL }) => {
const User = IDL.Record({ age: IDL.Nat8, name: IDL.Text });
const CertifiedUser = IDL.Record({
certificate: IDL.Vec(IDL.Nat8),
user: User,
witness: IDL.Vec(IDL.Nat8),
});
return IDL.Service({
get_user: IDL.Func([IDL.Nat64], [CertifiedUser], ["query"]),
set_user: IDL.Func([User], [IDL.Nat64], []),
});
};
const canisterId = Principal.fromText("a3shf-5eaaa-aaaaa-qaafa-cai");
const host = "http://localhost:35777";
start().await;
async function start() {
const agent = new HttpAgent({ fetch, host });
await agent.fetchRootKey();
const rootKey = agent.rootKey.buffer;
let dummyUser = { name: "test_user", age: 21 };
const actor = Actor.createActor(idlFactory, {
agent,
canisterId,
});
let index = await actor.set_user(dummyUser);
let certifiedUser = await actor.get_user(index);
await verifyCertificate(certifiedUser, index, rootKey, canisterId);
}
async function verifyCertificate(certifiedUser, index, rootKey, canisterId) {
const certificate = certifiedUser.certificate.buffer;
const witness = certifiedUser.witness.buffer;
const user = certifiedUser.user;
const cert = new Certificate(certificate, rootKey, canisterId, blsVerify);
// Step 1: Check if signature in the certificate can be validated with the
// root_hash of the tree in certificate as message and root_key as public_key
await cert.verify();
console.log("Certificate verification succeeded");
// Step 2: Check if the response is not stale with the given time offset of 5m.
const te = new TextEncoder();
const pathTime = [te.encode("time")];
const rawTime = cert.lookup(pathTime).value;
console.log("Time skew: ", verifyTime(rawTime));
// Step 3: Check if witness root_hash matches the certified_data
const pathData = [
te.encode("canister"),
canisterId.toUint8Array(),
te.encode("certified_data"),
];
const certifiedData = cert.lookup(pathData).value;
let witnessTree = Cbor.decode(witness);
let witnessRootHash = await reconstruct(witnessTree);
console.log(
"Verify CertifiedData matches witness root_hash: ",
certifiedData.buffer === witnessRootHash.buffer
);
// Step 4: Check if the query parameters are in the witness
const query_params = [te.encode("user"), bigEndian(index).buffer];
const witnessData = Cbor.decode(lookup_path(query_params, witnessTree).value);
console.log("Witness data: ", witnessData);
// Step 5: Check if the data found in Witness matches the returned result from the query.
assert.deepStrictEqual(witnessData, user, "Value matches response data");
// Step 6: Return the result
return user
}
function verifyTime(rawTime) {
const idlMessage = new Uint8Array([
...new TextEncoder().encode("DIDL\x00\x01\x7d"),
...new Uint8Array(rawTime),
]);
const decodedTime = IDL.decode([IDL.Nat], idlMessage)[0];
const time = Number(decodedTime) / 1e9;
const now = Date.now() / 1000;
const diff = Math.abs(time - now);
if (diff > 5) {
return false;
}
return true;
}
function bigEndian(n) {
let buf = new Uint8Array(8);
for (let i = 7; i >= 0; i--) {
buf[i] = Number(n & 0xffn);
n >>= 8n;
}
return buf;
}
Use HTTP asset certification and avoid serving your dapp through raw.icp0.io
Security concern
Dapps on ICP can use asset certification to make sure the HTTP assets delivered to the browser are authentic (i.e. threshold-signed by the subnet). If an app does not do asset certification, it can only be served insecurely through raw.icp0.io
, where no asset certification is checked. This is insecure since a single malicious node or boundary node can freely modify the assets delivered to the browser.
If an app is served through raw.icp0.io
in addition to icp0.io
, an adversary may trick users (phishing) into using the insecure raw.icp0.io
.
Recommendation
Only serve assets through
<canister-id>.icp0.io
where the boundary nodes enforce response verification on the served assets. Do not serve through<canister-id>.raw.icp0.io
.Serve assets using the asset canister, which creates asset certification automatically, or add the
ic-certificate
header including the asset certification as e.g. done in the NNS dapp and Internet Identity.Check in the canister’s
http_request
method if the request came through raw. If so, return an error and do not serve any assets.