Use a larger (or even configurable) buffer size in filecmp for performance

# Feature or enhancement

### Proposal:

Currently, `filecmp.cmp(f1, f2, shallow=False)` iterates over both files in chunks of `BUFSIZE = 8 * 1024` bytes (8 KiB). An earlier suggestion (#83370) to unify this with `io.DEFAULT_BUFFER_SIZE` was rejected, and the two values appear to be equal only by coincidence.

I propose using a considerably larger buffer here. Even if the OS is fundamentally reading the file a page at a time, the per-chunk overhead of processing the file in Python code slows everything down. For example, on my system:

```
$ wc -c testfile.bin 
212699823 testfile.bin

$ for i in $(seq 12 25); do (size=$(echo "2 ^ $i" | bc); echo "Buffer size: $size bytes"; python -m timeit "with open('testfile.bin', 'rb') as f:" "    while f.read($size): pass"; echo); done
Buffer size: 4096 bytes
5 loops, best of 5: 83.8 msec per loop

Buffer size: 8192 bytes
5 loops, best of 5: 52 msec per loop

Buffer size: 16384 bytes
10 loops, best of 5: 38.3 msec per loop

Buffer size: 32768 bytes
10 loops, best of 5: 29.6 msec per loop

Buffer size: 65536 bytes
10 loops, best of 5: 26.3 msec per loop

Buffer size: 131072 bytes
10 loops, best of 5: 25.8 msec per loop

Buffer size: 262144 bytes
10 loops, best of 5: 25.5 msec per loop

Buffer size: 524288 bytes
10 loops, best of 5: 25.1 msec per loop

Buffer size: 1048576 bytes
10 loops, best of 5: 25.1 msec per loop

Buffer size: 2097152 bytes
10 loops, best of 5: 26.1 msec per loop

Buffer size: 4194304 bytes
10 loops, best of 5: 29.1 msec per loop

Buffer size: 8388608 bytes
10 loops, best of 5: 33.2 msec per loop

Buffer size: 16777216 bytes
10 loops, best of 5: 34.4 msec per loop

Buffer size: 33554432 bytes
2 loops, best of 5: 110 msec per loop
```

Even though the disk/OS page size is 4 KiB, reading in 8 KiB chunks (as already happens) is faster, and it gets consistently faster still (by as much as 2x) with chunks of at least 64 KiB and possibly up to 512 KiB. These are still tiny amounts of memory to use for a temporary buffer nowadays. (For completeness, I include the serious performance impact of using a very large buffer, where it seems Linux has switched to a different strategy for the underlying `malloc()`.)
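For anyone who wants to reproduce the measurement without `bc` and a shell loop, here is a rough Python equivalent. It is only a sketch: the helper names `time_chunked_read` and `make_test_file` are made up for this example, the 16 MiB random test file is an arbitrary stand-in for the much larger file used above, and the timings will depend heavily on the system and on whether the file is already in the page cache.

```python
import os
import tempfile
import time


def time_chunked_read(path, bufsize):
    """Return (seconds, bytes_read) for one pass over path in bufsize chunks."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        # Same loop shape as the timeit one-liner: read until an empty chunk.
        while chunk := f.read(bufsize):
            total += len(chunk)
    return time.perf_counter() - start, total


def make_test_file(size):
    """Create a temporary file holding `size` random bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


if __name__ == "__main__":
    path = make_test_file(16 * 1024 * 1024)  # arbitrary size chosen for the sketch
    try:
        for exp in range(12, 26):
            bufsize = 2 ** exp
            # Best of 5 passes, mirroring timeit's "best of 5" reporting.
            elapsed, _ = min(time_chunked_read(path, bufsize) for _ in range(5))
            print(f"Buffer size: {bufsize} bytes -> {elapsed * 1000:.1f} msec")
    finally:
        os.remove(path)
```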

----

Rather than *just* hard-coding a buffer size, it could also be provided as a keyword-only parameter:

```
def cmp(f1, f2, shallow=True, *, bufsize=BUFSIZE):
    # ...
    if outcome is None:
        outcome = _do_cmp(f1, f2, bufsize)
    # ...

def _do_cmp(f1, f2, bufsize):
    # as before, but without the assignment from the global
```

Since different designs are possible, I wanted to discuss this before submitting a PR.

If there are other places in the standard library that iterate over a file(-like object) in chunks, this should probably be considered there as well.
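As one point of comparison, `shutil.copyfileobj` is an existing standard library function that copies in chunks and already lets the caller choose the chunk size through its optional `length` argument. The snippet below only illustrates that existing API with in-memory streams; the 256 KiB value is arbitrary, and whether `filecmp` should follow the same pattern is exactly the open question.

```python
import io
import shutil

# In-memory stand-ins for real files, so the example needs no disk access.
src = io.BytesIO(b"x" * 1_000_000)
dst = io.BytesIO()

# `length` sets the chunk size used by the copy loop.
shutil.copyfileobj(src, dst, length=256 * 1024)

copied = dst.getvalue()
print(len(copied))
```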

### Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

### Links to previous discussion of this feature:

_No response_

Use a larger (or even configurable) buffer size in filecmp for performance #150539