Peer-to-peer key-value database in Rust
- Rust (latest stable version)
- Cargo (comes with Rust)
- OpenSSL development libraries (for cryptography support)
- Clone the repository:
git clone https://github.com/dinxsh/prustdb.git
cd prustdb
- Build the project:
cargo build --release
- Run tests to ensure everything is working correctly:
cargo test
- Start a node:
cargo run --release -- --port 8000 --data-dir /path/to/data
- Connect to an existing network:
cargo run --release -- --port 8001 --peer 127.0.0.1:8000 --data-dir /path/to/data
- Use the API to interact with the database:
use prustdb_client::Client;

async fn example() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::connect("127.0.0.1:8000").await?;

    // Set a value
    client.set("key", "value").await?;

    // Get a value
    let value = client.get("key").await?;
    println!("Value: {:?}", value);

    Ok(())
}
Our P2P Key-Value Database is built on a distributed hash table (DHT) architecture, similar to Kademlia. Here's a high-level overview:
- Node Identification: Each node in the network is assigned a unique ID using a cryptographically secure random number generator.
- Key-Value Storage: Data is stored as key-value pairs, distributed across the network based on the proximity of the key to node IDs.
- Routing: Nodes maintain a routing table of known peers, organized into k-buckets based on the XOR distance between node IDs.
- Lookup Protocol: To find a key, nodes perform iterative lookups, querying progressively closer nodes until the key is found or the closest nodes are reached.
- Data Replication: Key-value pairs are replicated across multiple nodes using a configurable replication factor to ensure data availability and fault tolerance.
- Network Joining: New nodes join the network by bootstrapping through a known peer, and then performing lookups for their ID to populate their routing table.
- Load Balancing: The DHT naturally distributes data and lookup load across the network, with additional active balancing mechanisms for hotspots.
- Consistency: We implement tunable consistency levels, from eventual consistency to strong consistency, allowing users to choose the appropriate trade-off between performance and consistency for their use case.
- Caching: An intelligent caching layer improves read performance for frequently accessed data.
- Compression: Data is compressed before storage and transmission to reduce storage requirements and network usage.
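To make the routing and replication items above more concrete, here is a minimal sketch of Kademlia-style XOR distance, k-bucket selection, and replica placement. It is not taken from the PrustDB source: the 160-bit `Id` type and the function names `xor_distance`, `bucket_index`, and `closest_peers` are assumptions chosen for illustration, and it uses only the standard library.

```rust
/// Hypothetical 160-bit identifier for both nodes and keys.
/// The width is an assumption borrowed from Kademlia, not a PrustDB constant.
type Id = [u8; 20];

/// XOR distance between two identifiers, kept as a big-endian byte array
/// so that ordinary array comparison orders distances numerically.
fn xor_distance(a: &Id, b: &Id) -> Id {
    let mut d = [0u8; 20];
    for i in 0..20 {
        d[i] = a[i] ^ b[i];
    }
    d
}

/// Index of the k-bucket that `peer` falls into, relative to `own`.
/// Bucket i holds peers whose distance d satisfies 2^i <= d < 2^(i+1).
/// Returns None when both IDs are equal (distance zero).
fn bucket_index(own: &Id, peer: &Id) -> Option<usize> {
    let d = xor_distance(own, peer);
    for (i, byte) in d.iter().enumerate() {
        if *byte != 0 {
            // Count the zero bits in front of the highest set bit.
            let leading = i * 8 + byte.leading_zeros() as usize;
            return Some(159 - leading);
        }
    }
    None
}

/// The `n` peers closest to `key` by XOR distance. With `n` set to the
/// replication factor, these are the candidates for holding replicas.
fn closest_peers(key: &Id, peers: &[Id], n: usize) -> Vec<Id> {
    let mut sorted = peers.to_vec();
    sorted.sort_by_key(|p| xor_distance(key, p));
    sorted.truncate(n);
    sorted
}
```

An iterative lookup would call something like `closest_peers` repeatedly, each time querying the returned peers for nodes even closer to the key, and stop once a round yields no closer node.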
PrustDB is designed for high performance:
- Asynchronous I/O for efficient resource utilization
- Custom memory allocator for reduced memory fragmentation
- Bloom filters for rapid non-existence proofs
- Optimized data structures for fast lookups and insertions
- Benchmarking suite for continuous performance monitoring
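As an illustration of the Bloom filter item above, the following is a minimal sketch of how such a filter can rule out a key without touching storage. It is not PrustDB's implementation: the `BloomFilter` type, its parameters, and the use of the standard library's `DefaultHasher` are assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Bloom filter (illustrative only).
/// `contains` can return false positives but never false negatives,
/// so a `false` answer proves the key was never inserted.
struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        BloomFilter { bits: vec![false; num_bits], num_hashes }
    }

    /// Bit position for the i-th hash, derived by hashing (i, key) together.
    fn position<K: Hash + ?Sized>(&self, key: &K, i: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        i.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    /// Set every bit position associated with `key`.
    fn insert<K: Hash + ?Sized>(&mut self, key: &K) {
        for i in 0..self.num_hashes {
            let pos = self.position(key, i);
            self.bits[pos] = true;
        }
    }

    /// True only if every bit position associated with `key` is set.
    fn contains<K: Hash + ?Sized>(&self, key: &K) -> bool {
        (0..self.num_hashes).all(|i| self.bits[self.position(key, i)])
    }
}
```

A node could consult a filter like this before a disk read or a network lookup: if `contains` returns `false`, the key is definitely absent and the more expensive step can be skipped. The bit count and number of hashes trade memory against the false-positive rate.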
Security is a top priority for PrustDB:
- End-to-end encryption for all data transmissions
- Authentication and authorization for all nodes and clients
- Regular security audits and penetration testing
- Sandboxing of untrusted code execution
- Detailed security documentation and best practices
We welcome contributions from the community! Please check out our Contributing Guidelines for details on how to get started, our code of conduct, and the process for submitting pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.