RSA

邱云中 - 2024213716

实验基本情况

本项目实现了一套基于 C++ 的 RSA 密钥生成、加解密与签名验证流程。效果上，基本满足了 1 秒生成 RSA768 以内密钥的要求。实现上，没有使用线程的大数运算方法，而是参考标准库的设计思路构建了一套新的大数运算流程。算法优化上，参考了蒙哥马利蒙哥马利模乘算法论文，同时也参考了 OpenSSL 对于大数运算的设计思路。

实验感悟

通过本次实验，我更好地了解到了RSA 密码体系的算法实现。但与我最初构想不同的是，我发现自己大部分时间和大部分实验效果提升不在于加解密算法本身的优化，而是在于大数运算基础性能的优化。一些比较显而易见的问题，包括乘除法优化、素数筛、多进程等，是比较容易想到和实现的。像一些其他的问题：如大数存储方式、大数进制转换、运算结果缓存等，则是我在实验过程中不断碰壁和对于标准库的借鉴中领悟的。“保运算”其言也善，让我对于计算机中的“计算学”的魅力有了更深刻的领悟。

目录导览

模块	文件	说明
大整数核心	`BigInteger.h/.cpp`	32 位 limb 表示、加减乘除、除法和模运算、位操作、字符串转换
蒙哥马利运算	`Montgomery.h/.cpp`	蒙哥马利乘法/模幂、$R^2$ 预乘、域转换、位长判定
RSA 协议	`RSA.h/.cpp`	密钥生成、Miller–Rabin 素性检测、并行候选筛选、加解密与签名
并行乘法	`BigIntegerParallel.*`	Karatsuba 递归、`std::async` 并行乘法辅助
演示程序	`main.cpp`	逐个测试 RSA-128/256/512/768 密钥生成、加解密、签名与验签
其他文档	`API_DOCUMENTATION.md` 等	API、并行设计、性能评估

程序运行情况

密钥生成速度

对多种不同级别的 RSA 加密配置进行了测试。程序经过多重优化后效果如下表所示（详细输出 log 在标准阳历测试中）。可以看到当前版本的程序可以满足 1 秒内生成 RSA768 以内的密钥，同时我还测试了更高的密钥位数，发现在生成 RSA2048 时生成时间超过了 1 秒，随后生成时间快速上升（如 RSA4096 是 RSA2048 密钥生成时间的十倍）。这种非线性增长说明程序还存在优化空间。

注：随硬件配置、系统负载不同，实际耗时可能有所差异。

RSA 位数	密钥生成时间
128 bit	10 ms
256 bit	15 ms
512 bit	81 ms
768 bit	153 ms
1024 bit	258 ms
2048 bit	2291 ms
4096 bit	23753 ms

标准样例测试

在 main.cpp中已经内置好了一系列加解密与数字签名测试。样例输出如下：

=== RSA Algorithm Implementation ===
------------------------------------------------------------
=== Parallel RSA Key Generation Performance Test ===
Using multi-threading for parallel prime generation (p and q generated simultaneously)
Warm-up run: 232 ms
Testing RSA-128 key generation... ms
✓ Key generation completed in 10 msr RSA-768) ✓
  (Performance target: < 1000 ms for RSA-768) ✓05055990322430101213325990971750649962136958729052466088428676783480914144268268148598168652457299853177928487
Public Key (n): 16138742364727211635145156055129615575344281022286357992594973101670594949
Public Exponent (e): 65537
Private Exponent (d): 75260119344918601998128829850719132713...717266991...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 47239022291149104334886157197048391275:60814779436509520994166980980496001160:7122672044940916623050...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 21306144489840798201238570290767253351:56825654981034270286611691834244275783...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

Testing RSA-256 key generation...
✓ Key generation completed in 15 ms
  (Performance target: < 1000 ms for RSA-768) ✓
Public Key (n): 71424393664292800942734583226616150431568794645910901617425554522125922656709
Public Exponent (e): 65537
Private Exponent (d): 24595354017055402775159590372740181621489700213082...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 9254821646878981571080132011678377293669894687663293312561873838708751179201:33837405534778353711845...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 25866144904067994492879050946986627552251268535441036346080888442042111393322...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

Testing RSA-512 key generation...
✓ Key generation completed in 61 ms
  (Performance target: < 1000 ms for RSA-768) ✓
Public Key (n): 5720310257765414771279023907054107736079660673208619193358942898174033939552275195236762239388734531067210782408163142460361011214172635175974087868974897
Public Exponent (e): 65537
Private Exponent (d): 49783990138725599632183604160496382645341655220379...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 1666966457494095608024716523289397548975327451317190224239452618532340975603686399879491227245090010...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 3552478420180876346791673801553733146517781154628885046393113047524763330105395736393565484109226745...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

Testing RSA-768 key generation...
✓ Key generation completed in 121 ms
  (Performance target: < 1000 ms for RSA-768) ✓
Public Key (n): 1386679386742326780568387665288890112608598325524290023386512642647288529655650712577242575912200367506224518265603338170354384179080796685104326597052090795664440853396187872936886163192196907789922330204357366687056359805838145699
Public Exponent (e): 65537
Private Exponent (d): 88276318254232899907065782357321489216997010450888...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 6567052269902344032545726807236999550176024344429379445552850589036802529841072951309652133198549166...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 4457467765379265504523101841695024295834295403066668587076348121589587283331031343728155587275005522...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

Testing RSA-1024 key generation...
✓ Key generation completed in 254 ms
  (Performance target: < 1000 ms for RSA-768) ✓
Public Key (n): 148302575930553960390379235499303128887600259994814522187744273280670619235549456974058659861379745589033290276655489655565959748020996855987131532257846320707549508916299864206073682063232259877427266672423992306967480837813499338455834692966042713029333917578018545087643988406535155303035976291046615688461
Public Exponent (e): 65537
Private Exponent (d): 13822143129819944471924476956175036572794603172258...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 5255679430010989784251654478155524950535134461024821792457158298703155078548721733202348499530946781...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 1659574054414920182592917535050225302374040773271770299301631970411099436914649462767829274218181843...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

Testing RSA-2048 key generation...
✓ Key generation completed in 1333 ms
Public Key (n): 28812188927005837992588008632898984919351829743707907561270264397172744873523343517936583013286893561092170357974089108237825886411578942130310431425546224858672111862679695224206116939065464493013864723056264579414506614553629344929359903785727549291385128165935553204484135657427218164477547317831856798963387907317360129742580400476130686754503085301348504902513188316154754698443825337202576873837916017512992934064549359464520947229300035603539924789820734373715145859081240766943003974773612621161322438420391481030162574548183870388496714646531226444223723403340122523426839565093015850557501321679417648237263
Public Exponent (e): 65537
Private Exponent (d): 12476328021720534604752355112273071593546936483157...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 2761998503924848017349667628572333433494433070564333913164819484761594835359648516173962280933250351...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 5144739682010890587330896548982135924367232501729269986615358930852644683219025416656995527493850920...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

Testing RSA-4096 key generation...
✓ Key generation completed in 29436 ms
Public Key (n): 490075659514301156146999543173061365834430341292263446195280591841708143615811013296526199068716547468016630654092171490587186625136435009277053766600168885020804624009365738246551856129990001850751063733523725350036912906366559019270421306569695792352664476162383439096855076867027728763298416883614517188222295779225951476969195128846313571217757014344754670102831943521763098148984549195524510475612214021045343409676494098682944983449914916151879731976412523263919266824889239055587549798759589438994777801273404173369487006178812730147657525823251942895239401634490926951282681629194517477465449058858350052754291393735082165666696535628454704547691793857806474063218571263345068294156338841018533527108407381769238642911101783258632269040334804145922603970442616739955217146342451055693244470160509556444988990146895629671158129369814548750743440130146563735771144452090336650541945474801712358678429379052885574539281306162506191774338181018120014036162098983062158775474243858910019671562325688079102760493534760872369027575389518005775932858995250227206407725259006976775857548916220006677444972572362782659227493776631732390304998044325556946919625740371001029428781345393213548793341021562706961669685258023611975936307967
Public Exponent (e): 65537
Private Exponent (d): 12706357639908761837206183129524785584110716632189...

Original message: Hello, RSA! This is a test message for RSA encryption.
Encrypted: 2999268246162271299270920359635858183725912857792977170274258763426422721252549170616671231356315060...
Decrypted: Hello, RSA! This is a test message for RSA encryption.
✓ Encryption/Decryption test PASSED

Message to sign: This is a message to sign.
Signature: 3008615765609095036220238355268704679279093295779730883960209708809885678591166820248118207516535867...
✓ Signature verification PASSED
✓ Signature verification (wrong message) correctly REJECTED

------------------------------------------------------------

任意字符串测试

除了标准测试之外，还提供了额外的随机字符串输入接口，用户可以输入任意字符串进行测试。

以下是一个案例：

Enter a message to encrypt (or press Enter to skip): This is a RSA-C++ program. I like Cryptography because it's fun! 这是一个基于 C++语言实现的 RSA 项目.我喜欢密码学,密码学真有意思!&*!@#$%%……&*      

Generating keys for your message...
Keys generated in 151 ms
Encrypted message: 978391673660093288680289925656965994224553204431255751431341496518406257915032019511362279269319480343049597068702072184762392628767859920323799451701992828377510334369423111603601479687546817312515189131623643124175232794439862950:108242609230941347017922469065007174852873356856167262156936165964920843293607347388512014630046595709107476553156034600667047655774917999663267735526567104574603530454058879785995436230590980044324037649720034210938975259934359703
Decrypted message: This is a RSA-C++ program. I like Cryptography because it's fun! 这是一个基于 C++语言实现的 RSA 项目.我喜欢密码学,密码学真有意!@#$%%……&*
Digital signature: 582503111805146678833887670874125746382419626560858588812238645411134949064582569185908282716342948061793723046300814862404660245834177229830226145214119340322640357037581501209554765545845188786219302160231493763765059306357633818:473839537650908875517656767854644738739181236600844113336258268424424329640555620370182351650938758658347365548159832162630536337362229489730971448321865967773524772593855728403071897917761548374472807897433682436566430179770228314
Signature verified: Yes

核心算法与优化

32 位 Limb 大整数框架

Limb（肢） 是大数运算中的基本存储单元，类似于数字的"数字位"。在BigInteger类中，每个Limb是一个32位无符号整数（uint32_t），可以存储0到4,294,967,295（2³²-1）之间的值。

项目中的 BigInteger 以 32 位 limb（uint32_t）为基本单位搭建，整个整数表示可看成“一串以 $2^{32}$ 为基的高位次幂系数”。核心结构：

std::vector<uint32_t> digits：储存所有 limb，小端序（索引 0 为最低有效 limb）。
size_t top：有效 limb 个数（类似 OpenSSL Bn->top），配合 ensureCapacity() 与 setTop() 实现就地扩容、减少重新分配。
bool sign：符号位，true 表示负数。
cachedString / cacheValid：十进制字符串缓存，仅在需要时生成。

class BigInteger {
private:
    /// Base-2^32 digits stored least significant limb first.
    std::vector<Limb> digits;
    /// Count of meaningful limbs, mirroring OpenSSL's BN representation style.
    size_t top;
    /// Sign flag; @p false indicates non-negative values.
    bool sign;
    /// Cached base-10 string representation.
    mutable std::string cachedString;
    /// Indicates whether @p cachedString reflects the current numeric value.
    mutable bool cacheValid;
};

表示方式：每个 limb 是一个 uint32_t，底层以小端序排列（digits[0] 为最低位）。使用 top 记录有效 limb 个数，配合 ensureCapacity() 与 setTop() 实现就地运算与预扩容，减少分配开销。

一个整数 N 的值是：$N = \sum_{i=0}^{top-1} digits[i] \times (2^{32})^i$
例如 0x1234ABCD5678EF00 存储即 digits[0]=0x5678EF00，digits[1]=0x1234ABCD。由于 top 记录有效 limb，迭代运算可直接基于局部数组，不必每次删掉末尾 0。

内存与缓存策略

这一部分是本次实验的重中之重。在我的初始版本实现中，RSA 无法在规定时间内生成大于等于 RSA256 的密钥。经过单步调试与计时分析，我发现主要的计算开销在大数乘除法、大数模幂运算、大数进制转换等一些关键节点上。这促使我查阅了一些资料与代码库，尤其是 OpenSSL，使得我获得了一些灵感。

具体的内存与缓存策略如下：

ensureCapacity(size_t required)：确保向量容量至少 required，运算前一次性扩容。

void BigInteger::ensureCapacity(size_t required)
{
    if (digits.size() < required)
        digits.resize(required, 0);
}

setTop(size_t newTop)：更新有效 limb 个数，如果 newTop==0 则清零首 limb。

void BigInteger::setTop(size_t newTop)
{
    top = newTop;
    if (top == 0)
    {
        sign = false;
        if (digits.empty())
            digits.push_back(0);
        else
        {
            digits[0] = 0;
            digits.resize(1);
        }
    }
    else
    {
        ensureCapacity(top);
        digits.resize(top);
    }
    cacheValid = false;
}

normalize()：移除尾端零 limb，如果结果为 0 则置 top=0/sign=false，并标记缓存失效。

void BigInteger::normalize()
{
    if (digits.empty())
        digits.push_back(0);

    size_t idx = digits.size();
    while (idx > 1 && digits[idx - 1] == 0)
        --idx;

    if (idx == 1 && digits[0] == 0)
    {
        digits.resize(1);
        top = 0;
        sign = false;
    }
    else
    {
        digits.resize(idx);
        top = idx;
    }
    cacheValid = false;
}

addAbs / subAbs / mulAbs / mulAbsSmall 全部在 limb 层处理进位和借位，借助 uint64_t 或 unsigned __int128 获得中间结果；

size_t BigInteger::addAbs(const Limb* a, size_t aTop,
                          const Limb* b, size_t bTop,
                          std::vector<Limb>& out)
{
    size_t n = std::max(aTop, bTop);
    if (n == 0)
    {
        if (out.size() < 1)
            out.resize(1);
        out[0] = 0;
        out.resize(1);
        return 0;
    }

    if (out.size() < n + 1)
        out.resize(n + 1);
    std::fill(out.begin(), out.begin() + (n + 1), 0);
    uint64_t carry = 0;
    size_t i = 0;
    for (; i < n; ++i)
    {
        uint64_t sum = carry;
        if (i < aTop)
            sum += a[i];
        if (i < bTop)
            sum += b[i];
        out[i] = static_cast<Limb>(sum & LIMB_MASK);
        carry = sum >> LIMB_BITS;
    }

    if (carry)
        out[i++] = static_cast<Limb>(carry);

    while (i > 0 && out[i - 1] == 0)
        --i;

    if (i == 0)
    {
        out[0] = 0;
        out.resize(1);
        return 0;
    }

    out.resize(i);
    return i;
}

size_t BigInteger::subAbs(const Limb* a, size_t aTop,
                          const Limb* b, size_t bTop,
                          std::vector<Limb>& out)
{
    if (aTop == 0)
    {
        if (out.size() < 1)
            out.resize(1);
        out[0] = 0;
        out.resize(1);
        return 0;
    }

    if (out.size() < aTop)
        out.resize(aTop);
    std::fill(out.begin(), out.begin() + aTop, 0);
    int64_t borrow = 0;
    for (size_t i = 0; i < aTop; ++i)
    {
        int64_t cur = static_cast<int64_t>(a[i]) - (i < bTop ? static_cast<int64_t>(b[i]) : 0) - borrow;
        if (cur < 0)
        {
            cur += static_cast<int64_t>(LIMB_BASE);
            borrow = 1;
        }
        else
        {
            borrow = 0;
        }
        out[i] = static_cast<Limb>(cur);
    }
    size_t resTop = aTop;
    while (resTop > 0 && out[resTop - 1] == 0)
        --resTop;

    if (resTop == 0)
    {
        out[0] = 0;
        out.resize(1);
        return 0;
    }
    out.resize(resTop);
    return resTop;
}

size_t BigInteger::mulAbs(const Limb* a, size_t aTop,
                          const Limb* b, size_t bTop,
                          std::vector<Limb>& out)
{
    if (aTop == 0 || bTop == 0)
    {
        if (out.size() < 1)
            out.resize(1);
        out[0] = 0;
        out.resize(1);
        return 0;
    }

    size_t required = aTop + bTop + 1;
    if (out.size() < required)
        out.resize(required);
    std::fill(out.begin(), out.begin() + required, 0);
    for (size_t i = 0; i < aTop; ++i)
    {
        DoubleLimb carry = 0;
        for (size_t j = 0; j < bTop; ++j)
        {
            DoubleLimb cur = static_cast<DoubleLimb>(out[i + j])
                             + static_cast<DoubleLimb>(a[i]) * b[j]
                             + carry;
            out[i + j] = static_cast<Limb>(cur & LIMB_MASK);
            carry = cur >> LIMB_BITS;
        }
        size_t j = bTop;
        while (carry)
        {
            DoubleLimb cur = static_cast<DoubleLimb>(out[i + j]) + carry;
            out[i + j] = static_cast<Limb>(cur & LIMB_MASK);
            carry = cur >> LIMB_BITS;
            ++j;
        }
    }

    size_t resTop = aTop + bTop + 1;
    while (resTop > 0 && out[resTop - 1] == 0)
        --resTop;

    if (resTop == 0)
    {
        out[0] = 0;
        out.resize(1);
        return 0;
    }
    out.resize(resTop);
    return resTop;
}

size_t BigInteger::mulAbsSmall(const Limb* a, size_t aTop,
                               uint32_t b,
                               std::vector<Limb>& out)
{
    if (b == 0 || aTop == 0)
    {
        if (out.size() < 1)
            out.resize(1);
        out[0] = 0;
        out.resize(1);
        return 0;
    }

    if (out.size() < aTop + 1)
        out.resize(aTop + 1);
    std::fill(out.begin(), out.begin() + (aTop + 1), 0);
    DoubleLimb carry = 0;
    size_t i = 0;
    for (; i < aTop; ++i)
    {
        DoubleLimb cur = static_cast<DoubleLimb>(a[i]) * b + carry;
        out[i] = static_cast<Limb>(cur & LIMB_MASK);
        carry = cur >> LIMB_BITS;
    }
    if (carry)
        out[i++] = static_cast<Limb>(carry);

    while (i > 0 && out[i - 1] == 0)
        --i;

    if (i == 0)
    {
        out[0] = 0;
        out.resize(1);
        return 0;
    }
    out.resize(i);
    return i;
}

divMod 实现 Knuth 风格长除法，利用 subMulAt 和 addAt 细调候选商；

std::pair<BigInteger, BigInteger> BigInteger::divMod(const BigInteger& numerator,
                                                     const BigInteger& denominator)
{
    if (denominator.isZero())
        throw std::runtime_error("Division by zero");

    if (numerator.isZero())
        return {BigInteger(0), BigInteger(0)};

    BigInteger absA = numerator.absolute();
    BigInteger absB = denominator.absolute();
    absA.normalize();
    absB.normalize();

    int cmp = compareAbs(absA, absB);
    if (cmp < 0)
    {
        BigInteger remainder = numerator;
        remainder.normalize();
        return {BigInteger(0), remainder};
    }
    if (cmp == 0)
    {
        BigInteger quotient(1);
        quotient.sign = (numerator.sign != denominator.sign);
        quotient.normalize();
        BigInteger remainder(0);
        return {quotient, remainder};
    }

    if (absB.top == 1)
    {
        uint32_t den = absB.digits[0];
        uint32_t rem = 0;
        BigInteger quotient = divAbsSmall(absA, den, rem);
        quotient.sign = (numerator.sign != denominator.sign) && !quotient.isZero();
        quotient.normalize();

        BigInteger remainder = fromUint64(rem);
        if (numerator.sign && !remainder.isZero())
        {
            remainder = absB - remainder;
            remainder.sign = false;
        }
        else
        {
            remainder.sign = false;
        }
        remainder.normalize();
        return {quotient, remainder};
    }

    uint32_t norm = static_cast<uint32_t>(LIMB_BASE / (static_cast<uint64_t>(absB.digits.back()) + 1));
    std::vector<Limb> u;
    size_t uTop = mulAbsSmall(absA.digits.data(), absA.top, norm, u);
    std::vector<Limb> v;
    size_t vTop = mulAbsSmall(absB.digits.data(), absB.top, norm, v);

    u.push_back(0);
    uTop = u.size();

    if (u.size() <= v.size())
    {
        u.push_back(0);
        uTop = u.size();
    }

    size_t n = v.size();
    size_t m = u.size() - n - 1;

    std::vector<Limb> q(m + 1, 0);

    for (int64_t j = static_cast<int64_t>(m); j >= 0; --j)
    {
        DoubleLimb numerator2 = (static_cast<DoubleLimb>(u[j + n]) << LIMB_BITS)
                                | static_cast<DoubleLimb>(u[j + n - 1]);
        uint64_t qt = static_cast<uint64_t>(numerator2 / v[n - 1]);
        uint64_t rt = static_cast<uint64_t>(numerator2 % v[n - 1]);

        if (qt >= LIMB_BASE)
        {
            qt = LIMB_BASE - 1;
            rt = static_cast<uint64_t>(numerator2 - static_cast<DoubleLimb>(qt) * v[n - 1]);
        }

        if (n > 1)
        {
            uint64_t vn2 = v[n - 2];
            uint64_t uj2 = u[j + n - 2];
            while (qt == LIMB_BASE || static_cast<DoubleLimb>(qt) * vn2 > (static_cast<DoubleLimb>(rt) << LIMB_BITS) + uj2)
            {
                --qt;
                rt += v[n - 1];
                if (rt >= LIMB_BASE)
                    break;
            }
        }

        if (subMulAt(u, uTop, v, vTop, qt, static_cast<size_t>(j)))
        {
            --qt;
            uTop = addAt(u, uTop, v, vTop, static_cast<size_t>(j));
        }
        q[static_cast<size_t>(j)] = static_cast<uint32_t>(qt);
    }

    std::vector<Limb> remainderDigits(u.begin(), u.begin() + n);
    BigInteger remainderTemp;
    remainderTemp.digits = remainderDigits;
    remainderTemp.top = remainderDigits.size();
    remainderTemp.sign = false;
    remainderTemp.normalize();

    uint32_t remNorm = 0;
    BigInteger remainder = divAbsSmall(remainderTemp, norm, remNorm);
    remainder.sign = false;
    remainder.normalize();

    BigInteger quotient;
    quotient.digits = q;
    quotient.top = q.size();
    quotient.normalize();
    quotient.sign = (numerator.sign != denominator.sign) && !quotient.isZero();

    if (numerator.sign && !remainder.isZero())
    {
        if (denominator.sign)
            quotient = quotient + BigInteger(1);
        else
            quotient = quotient - BigInteger(1);

        remainder = absB - remainder;
        remainder.sign = false;
        remainder.normalize();
    }

    return {quotient, remainder};
}

bitLength() / isOdd() / testBit() 均直接在 limb 位操作上完成，取代昂贵的十进制转换。

size_t BigInteger::bitLength() const
{
    if (top == 0)
        return 0;
    size_t bits = (top - 1) * LIMB_BITS;
    Limb highest = digits[top - 1];
    if (highest == 0)
        return bits;
#if defined(__GNUG__) || defined(__clang__)
    unsigned int hi = highest;
    int leading = __builtin_clz(hi);
    bits += LIMB_BITS - static_cast<size_t>(leading);
#else
    size_t local = 0;
    Limb temp = highest;
    while (temp)
    {
        ++local;
        temp >>= 1U;
    }
    bits += local;
#endif
    return bits;
}

bool BigInteger::isOdd() const
{
    return top != 0 && (digits[0] & 1U) != 0;
}

bool BigInteger::testBit(size_t index) const
{
    size_t limbIndex = index / LIMB_BITS;
    if (limbIndex >= top)
        return false;
    size_t bitOffset = index % LIMB_BITS;
    return ((digits[limbIndex] >> bitOffset) & 1U) != 0;
}

符号与缓存：负号通过 sign 标记维护；normalize() 清理高位 0 并同步 sign；cachedString 仅在需要十进制表示时更新。

2. 大数运算并行优化

在 limb 级算术稳定后，项目还提供 BigIntegerParallel 作为“加速器”，在输入规模超过阈值时切换至 Karatsuba 乘法并通过 std::async 并行化子任务：

BigInteger BigIntegerParallel::multiplyParallel(const BigInteger& a,
                                                const BigInteger& b,
                                                int depth) {
    if (a.isZero() || b.isZero()) return BigInteger(0);

    std::vector<uint32_t> resultDigits;
    size_t aLimbs = a.significantLimbs();
    size_t bLimbs = b.significantLimbs();
    if (std::min(aLimbs, bLimbs) <= KARATSUBA_THRESHOLD) {
        resultDigits = BigInteger::mulAbs(a.digits, b.digits);
    } else {
        resultDigits = karatsubaMultiplyVec(a.digits, b.digits, depth);
    }
    bool resultSign = (a.sign != b.sign);
    return createBigInteger(resultDigits, resultSign);
}

若递归深度仍在允许的范围内，三个子乘法会被分派到不同线程执行：

if (useParallel) {
    z0_future = std::async(std::launch::async, karatsubaMultiplyVec, x0, y0, depth + 1);
    z1_future = std::async(std::launch::async, karatsubaMultiplyVec, x1, y0, depth + 1);
    z2_future = std::async(std::launch::async, karatsubaMultiplyVec, x0, y1, depth + 1);
    z3 = karatsubaMultiplyVec(x1, y1, depth + 1);      // 主线程继续处理其中一支
    z0 = z0_future.get();
    z1 = z1_future.get();
    z2 = z2_future.get();
} else {
    z0 = karatsubaMultiplyVec(x0, y0, depth + 1);
    z1 = karatsubaMultiplyVec(x1, y0, depth + 1);
    z2 = karatsubaMultiplyVec(x0, y1, depth + 1);
    z3 = karatsubaMultiplyVec(x1, y1, depth + 1);
}

这样在 512 bit 以上的大乘法场景下可以显著缩短运行时间。

蒙哥马利模幂优化

蒙哥马利模乘（Montgomery Modular Multiplication）由Peter L. Montgomery在1985年论文《Modular multiplication without trial division》中提出，其核心思想是：将昂贵的模运算转换为廉价的移位和加法运算。项目中的 MontgomeryReducer 在构造时完成以下准备：

MontgomeryReducer::MontgomeryReducer(const BigInteger& mod)
    : modulus(mod), limbCount(0), nInv(0)
{
    // 归一化、缓存 limb 表达，计算 nInv 与 R^2 mod n
    modulusDigits = modulus.digits;
    limbCount = modulusDigits.size();
    nInv = computeInverse32(modulusDigits[0]);
    r2 = r % modulus;
    montOne = toMontgomery(BigInteger(1));
}

在蒙哥马利域内的核心操作包括转换、乘法与快速幂：

BigInteger MontgomeryReducer::modPow(const BigInteger& base,
                                     const BigInteger& exponent) const {
    if (exponent.getSign()) throw std::invalid_argument("negative exponent");
    if (exponent.isZero()) return BigInteger(1);

    BigInteger baseMont = toMontgomery(reduce(base)); // 转换到蒙哥马利域
    BigInteger result = montOne;
    BigInteger current = baseMont;

    for (long bit = static_cast<long>(exponent.bitLength()) - 1; bit >= 0; --bit) {
        result = multiply(result, result);          // 域内平方
        if (exponent.testBit(static_cast<size_t>(bit)))
            result = multiply(result, current);     // 域内乘法
    }
    return fromMontgomery(result);                  // 转回普通表示
}

BigInteger MontgomeryReducer::multiply(const BigInteger& a,
                                       const BigInteger& b) const {
    std::vector<uint32_t> product = BigInteger::mulAbs(a.digits, b.digits);
    if (product.size() < limbCount * 2 + 2)
        product.resize(limbCount * 2 + 2, 0);
    return montgomeryReduce(std::move(product));
}

montgomeryReduce 将中间结果逐 limb 消除低位，并在末尾与 modulus 比较确保结果合法：

BigInteger MontgomeryReducer::montgomeryReduce(std::vector<uint32_t> t) const {
    for (size_t i = 0; i < limbCount; ++i) {
        uint64_t m = (static_cast<uint64_t>(t[i]) * nInv) & BigInteger::BASE_MASK;
        unsigned __int128 carry = 0;
        for (size_t j = 0; j < limbCount; ++j) {
            unsigned __int128 cur =
                static_cast<unsigned __int128>(t[i + j]) +
                static_cast<unsigned __int128>(m) * modulusDigits[j] + carry;
            t[i + j] = static_cast<uint32_t>(cur & BigInteger::BASE_MASK);
            carry = cur >> 32;
        }
        // 将可能产生的 carry 继续传播
    }
    // 结果位于 t[limbCount..]，如超出 modulus 再减一次
}

通过蒙哥马利域，模幂运算不再包含除法，成为一系列恒定时间的 limb 操作，从而显著提升 RSA 的性能与稳定性。

Miller–Rabin 素性检测并行优化

初筛：使用包含所有 <1000 的素数表进行 trial division，快速剔除明显合数；
候选生成：调整奇偶、使用轮式增量避免重复尝试；
Miller–Rabin：
- 根据候选位长动态确定轮数（≥256 bit 时默认 32 轮）；
- 测试基在 Montgomery 域内复用，省去重复转换；
- 检测到潜在 witness 后立即返回；
并行策略：generatePrimeWithSeed() 中使用 std::async 批量测试候选数，结合线程局部随机源减少冲突。

std::vector<future<bool>> futures;
for (const auto& candidate : candidateBatch) {
    futures.push_back(std::async(std::launch::async,
        [candidate, millerRabinIterations]() {
            if (!quickTrialDivision(candidate, 100)) {
                return false;
            }
            return isPrimeFast(candidate, millerRabinIterations);
        }));
}

for (size_t i = 0; i < candidateBatch.size(); ++i) {
    if (futures[i].get()) {
        return candidateBatch[i];
    }
}

RSA 密钥生成并行优化

RSA 密钥生成调用上述素性检测流程，并行地构造两个大素数 p、q：

RSA::KeyPair RSA::generateKeyPair(int bitLength) {
    int primeLength = bitLength / 2;
    auto deadline = std::chrono::steady_clock::now() + std::chrono::seconds(30);

    auto futureP = std::async(std::launch::async,
                              &RSA::generatePrimeParallel, primeLength, 1, deadline);
    auto futureQ = std::async(std::launch::async,
                              &RSA::generatePrimeParallel, primeLength, 2, deadline);

    BigInteger p = futureP.get();
    BigInteger q = futureQ.get();
    BigInteger n = p * q;
    BigInteger phi = (p - BigInteger(1)) * (q - BigInteger(1));
    BigInteger e(65537);
    if (gcd(e, phi) != BigInteger(1)) {
        // 调整 e，确保与 phi 互素
        e = BigInteger(3);
        while (gcd(e, phi) != BigInteger(1)) {
            e = e + BigInteger(2);
        }
    }
    BigInteger d = modInverse(e, phi);
    // ...
}

后续的 encrypt / decrypt、sign / verify 函数则复用 Montgomery 模幂完成核心运算。

构建与运行

mkdir -p build
cd build
cmake ..
cmake --build .
./rsa

程序会依次输出 RSA-128/256/512/768 的密钥生成耗时、模数信息，并验证加解密与签名验签是否通过。

如需将算法嵌入其他项目，可直接链接 BigInteger.cpp、Montgomery.cpp 与 RSA.cpp，并调用 RSA::generateKeyPair、RSA::encrypt、RSA::decrypt、RSA::sign、RSA::verify 等接口。

参考文献

Peter L. Montgomery, “Modular multiplication without trial division,” Mathematics of Computation, 1985.
Donald E. Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed., 1997.
A. A. Karatsuba, “Multiplication of multidigit numbers on automata,” Soviet Physics Doklady, 1962.
Gary L. Miller, “Riemann’s Hypothesis and Tests for Primality,” Journal of Computer and System Sciences, 1976.
Rivest, Shamir, Adleman, “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems,” Communications of the ACM, 1978.

参考代码

https://github.com/panks/BigInteger：一个 C++大数运算实现
https://github.com/sybrenstuvel/python-rsa：Python 版本的 RSA 算法实现
https://github.com/openssl/openssl： OpenSSL 框架

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.gitignore		.gitignore
BigInteger.cpp		BigInteger.cpp
BigInteger.h		BigInteger.h
BigIntegerParallel.cpp		BigIntegerParallel.cpp
BigIntegerParallel.h		BigIntegerParallel.h
CMakeLists.txt		CMakeLists.txt
Montgomery.cpp		Montgomery.cpp
Montgomery.h		Montgomery.h
README.md		README.md
RSA.cpp		RSA.cpp
RSA.h		RSA.h
main.cpp		main.cpp

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

RSA

实验基本情况

实验感悟

目录导览

程序运行情况

密钥生成速度

标准样例测试

任意字符串测试

核心算法与优化

32 位 Limb 大整数框架

内存与缓存策略

2. 大数运算并行优化

蒙哥马利模幂优化

Miller–Rabin 素性检测并行优化

RSA 密钥生成并行优化

构建与运行

参考文献

参考代码

About

Uh oh!

Releases

Packages

Languages

DigitalLifeYZQiu/RSA

Folders and files

Latest commit

History

Repository files navigation

RSA

实验基本情况

实验感悟

目录导览

程序运行情况

密钥生成速度

标准样例测试

任意字符串测试

核心算法与优化

32 位 Limb 大整数框架

内存与缓存策略

2. 大数运算并行优化

蒙哥马利模幂优化

Miller–Rabin 素性检测并行优化

RSA 密钥生成并行优化

构建与运行

参考文献

参考代码

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages