feat(storage): Add full object read checksum validation for Open #16120

Open
v-pratap wants to merge 4 commits into googleapis:main from v-pratap:full-read-in-open

Conversation

@v-pratap
Contributor

No description provided.

@v-pratap v-pratap requested review from a team as code owners May 27, 2026 10:09
@product-auto-label product-auto-label Bot added the api: storage Issues related to the Cloud Storage API. label May 27, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements checksum validation (CRC32C and MD5) for asynchronous reads by integrating hash validators into ObjectDescriptorImpl and ReadRange. However, the review identified several critical issues. First, OnRead validates the checksum before it processes the final chunk of data, which will cause false checksum mismatches. Second, the entire integration test suite was accidentally deleted and replaced with a single test containing a hardcoded project ID and bucket name. Finally, multiple debugging std::cout statements were left in the production code and should be removed.

I am having trouble creating individual review comments. My feedback follows.

google/cloud/storage/internal/async/read_range.cc (74-89)

critical

Correctness Bug: Incorrect Checksum Validation Order

The range_end checksum validation is performed at the very beginning of OnRead, before the current message's checksummed_data is processed and incorporated into the hash_function_.

Since the last message in a GCS read stream typically contains both the final chunk of data and range_end = true, calling hash_function_->Finish() here will exclude the last chunk from the computed hash, resulting in a false checksum mismatch error.

Fix: Defer the range_end check and object-level checksum validation until after the chunk's data has been successfully processed and added to the hash function (i.e., after the chunk validation and offset updates).
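
The ordering described in this fix can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and is not the code from the pull request: `Message`, `ToyHash`, `RangeReader`, and `ReadResult` are invented stand-ins, and the hash is a 32-bit FNV-1a placeholder rather than CRC32C. The only point it demonstrates is that the payload of the current message is folded into the hash before `range_end` triggers the final comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for one response message in the read stream.
struct Message {
  std::string data;  // payload carried by this message
  bool range_end;    // true on the final message of the range
};

// Toy rolling hash (32-bit FNV-1a). A placeholder for CRC32C, not the real
// algorithm used by the library.
class ToyHash {
 public:
  void Update(std::string const& chunk) {
    for (unsigned char c : chunk) {
      value_ ^= c;
      value_ *= 16777619u;
    }
  }
  std::uint32_t Finish() const { return value_; }

 private:
  std::uint32_t value_ = 2166136261u;
};

enum class ReadResult { kPending, kOk, kChecksumMismatch };

// Hashes each payload first and only then acts on range_end.
class RangeReader {
 public:
  explicit RangeReader(std::uint32_t expected) : expected_(expected) {}

  ReadResult OnRead(Message const& m) {
    assert(!done_);        // no messages are expected after range_end
    hash_.Update(m.data);  // 1. fold this chunk into the running hash
    if (!m.range_end) return ReadResult::kPending;
    done_ = true;          // 2. only now finalize and compare
    return hash_.Finish() == expected_ ? ReadResult::kOk
                                       : ReadResult::kChecksumMismatch;
  }

 private:
  ToyHash hash_;
  std::uint32_t expected_;
  bool done_ = false;
};

// Convenience helper: hash of a complete object held in memory.
std::uint32_t HashOf(std::string const& contents) {
  ToyHash h;
  h.Update(contents);
  return h.Finish();
}
```

If the comparison ran before `hash_.Update(m.data)`, a final message carrying both data and `range_end = true` would be compared against a hash that is missing its last chunk, which is the false mismatch the review describes.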

google/cloud/storage/tests/async_client_integration_test.cc (234-242)

critical

Critical Issue: Accidental Deletion of Integration Tests & Hardcoded Project and Bucket Names

It appears that the entire integration test suite for AsyncClient was accidentally deleted and replaced with a single manual test (StartAppendableUploadEmpty) containing a hardcoded project ID ("bajajnehaa-devrel-test") and bucket name.

This will break the CI/CD pipeline and prevent other developers from running the tests.

Fix: Please revert the changes to async_client_integration_test.cc to restore the full integration test suite, and avoid committing hardcoded project/bucket names.
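
One common way to avoid hardcoded project and bucket names is to read them from the environment and skip the test when they are missing. The sketch below is only an illustration under stated assumptions: `GetEnvOptional`, `TestConfig`, and `LoadTestConfig` are hypothetical helpers, and the two environment variable names are assumptions, not names confirmed by this pull request. It requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Returns the variable's value, or std::nullopt when it is unset or empty.
std::optional<std::string> GetEnvOptional(char const* name) {
  assert(name != nullptr);
  char const* value = std::getenv(name);
  if (value == nullptr || *value == '\0') return std::nullopt;
  return std::string(value);
}

// Hypothetical bundle of the settings an integration test needs.
struct TestConfig {
  std::string project_id;
  std::string bucket_name;
};

// Loads the settings from the environment. The variable names below are
// assumptions made for this sketch.
std::optional<TestConfig> LoadTestConfig() {
  auto project = GetEnvOptional("GOOGLE_CLOUD_PROJECT");
  auto bucket = GetEnvOptional("GOOGLE_CLOUD_CPP_STORAGE_TEST_BUCKET_NAME");
  if (!project || !bucket) return std::nullopt;
  return TestConfig{*project, *bucket};
}
```

A test fixture could call `LoadTestConfig()` during setup and skip the test when it returns `std::nullopt`, so the same source runs unchanged in CI and on a developer machine.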

google/cloud/storage/internal/async/object_descriptor_impl.cc (159-166)

medium

Please remove these debugging std::cout statements before merging. They are not suitable for production code.

  if (options_.get<storage::EnableCrc32cValidationOption>()) {
    hash_function =
        std::make_shared<storage::internal::Crc32cMessageHashFunction>(
            std::make_unique<storage::internal::Crc32cHashFunction>());
  }

google/cloud/storage/internal/async/object_descriptor_impl.cc (188-203)

medium

Please remove the debugging std::cout statements from the metadata checksum processing block.

    if (metadata_->has_checksums()) {
      auto const& checksums = metadata_->checksums();
      if (checksums.has_crc32c()) {
        hashes = Merge(std::move(hashes),
                       storage::internal::HashValues{
                           storage_internal::Crc32cFromProto(checksums.crc32c()), {}});
      }
      if (!checksums.md5_hash().empty()) {
        hashes = Merge(std::move(hashes),
                       storage::internal::HashValues{
                           {}, storage_internal::MD5FromProto(checksums.md5_hash())});
      }
    }

google/cloud/storage/internal/async/read_range.cc (99-101)

medium

Please remove this debugging std::cout statement.

@codecov

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 92.53731% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.71%. Comparing base (77109ef) to head (65f342f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...d/storage/internal/async/object_descriptor_impl.cc | 83.87% | 5 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #16120   +/-   ##
=======================================
  Coverage   92.71%   92.71%           
=======================================
  Files        2353     2353           
  Lines      219274   219335   +61     
=======================================
+ Hits       203303   203362   +59     
- Misses      15971    15973    +2     

