Extracting a corrupt 7z file and then trying to delete it causes a PermissionError on Windows #597

Open
ok-oldking opened this issue Jul 21, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@ok-oldking

import os

import py7zr

try:
    with py7zr.SevenZipFile('corrupt.7z', mode='r') as z:
        z.extractall('test')
except Exception as e:
    os.remove('corrupt.7z')  # raises PermissionError on Windows

When extraction of one particular corrupted 7z file fails and I then try to delete that file, I get PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'corrupt.7z'. This error occurs only with this corrupted file. When I test with other random invalid 7z files, the error does not occur.
Because the file is 80 MB, I can't upload it to GitHub, so here is the Google Drive link

Traceback (most recent call last):
  File "D:\projects\ok-wuthering-waves\test.py", line 7, in <module>
    z.extractall('test')
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\py7zr.py", line 999, in extractall
    self._extract(path=path, return_dict=False, callback=callback)
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\py7zr.py", line 629, in _extract
    self.worker.extract(
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\py7zr.py", line 1313, in extract
    raise exc_info[1].with_traceback(exc_info[2])
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\py7zr.py", line 1338, in extract_single
    self._extract_single(fp, files, path, src_end, q, skip_notarget)
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\py7zr.py", line 1407, in _extract_single
    crc32 = self.decompress(fp, f.folder, obfp, f.uncompressed, f.compressed, src_end, q)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\py7zr.py", line 1466, in decompress
    tmp = decompressor.decompress(fp, min(out_remaining, max_block_size))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\compressor.py", line 721, in decompress
    tmp = self._decompress(data, max_length)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\ok-wuthering-waves\venv\Lib\site-packages\py7zr\compressor.py", line 677, in _decompress
    data = decompressor.decompress(data, max_length)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_lzma.LZMAError: Corrupt input data

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\projects\ok-wuthering-waves\test.py", line 9, in <module>
    os.remove('corrupt.7z')
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'corrupt.7z'
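
The second traceback is what Windows reports when some handle to the file is still open at the moment os.remove runs. The sketch below reproduces that condition with the standard library only, without py7zr. The helper name try_remove_while_open and the temporary file are assumptions made for illustration; it does not show where py7zr keeps the handle open.

```python
import os
import tempfile


def try_remove_while_open(path):
    """Try to delete `path` while a handle to it is still open.

    Returns the PermissionError that was raised, or None if the delete succeeded.
    """
    leaked = open(path, "rb")  # stands in for a handle that was never closed
    try:
        os.remove(path)
        return None
    except PermissionError as exc:
        return exc
    finally:
        leaked.close()


# Create a scratch file to experiment on (hypothetical stand-in for corrupt.7z).
fd, demo_path = tempfile.mkstemp(suffix=".7z")
os.close(fd)

error = try_remove_while_open(demo_path)
if error is not None:
    # Typical on Windows: the open handle blocked the delete (WinError 32).
    # Once the handle is closed, the same call succeeds.
    os.remove(demo_path)

file_still_exists = os.path.exists(demo_path)
```

On Windows the first delete is expected to fail and the second to succeed. On Linux and macOS deleting an open file is normally allowed, so error stays None there.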
miurahr added the question (Further information is requested) label on Jul 23, 2024
@miurahr
Owner

miurahr commented Oct 13, 2024

Here is an AI-generated explanation of the situation that you can read.

Why doesn’t Windows unlock the file immediately?

Even though your program calls close() on a file, there are a few reasons why Windows (and sometimes other operating systems) might not release the lock right away:


1. Delayed or Buffered Writes

  • What happens: When a program writes to a file, Windows may not write the data to disk immediately. It temporarily stores it in a buffer to optimize performance.

  • Impact: The file might appear locked for a short time, even after closing, as Windows finishes the background work (like flushing the buffer).

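  • Solution: You can call flush() before closing the file to hand Python's buffered data to the operating system. To ask the OS to write it to disk as well, follow flush() with os.fsync(). Note that close() already flushes Python's buffer.

Example in Python:

import os

with open("example.txt", "w") as f:
    f.write("Hello, World!")
    f.flush()  # Hands Python's buffered data to the operating system.
    os.fsync(f.fileno())  # Asks the OS to write the data to disk.
# Leaving the with block closes the file and releases the handle.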

2. Antivirus or Backup Software Scanning

  • What happens: Some antivirus or backup tools monitor files when they are opened or written to. After your program closes the file, the antivirus might still be holding the lock while scanning.

  • Impact: Even though your program is done, the antivirus causes the file to remain temporarily locked.

  • Solution: Try disabling or excluding the file from antivirus scans during development to see if it resolves the issue.


3. Multiple File Handles (By Accident)

  • What happens: If your program accidentally opens a file more than once without properly closing all instances, the file may stay locked even after one handle closes.
  • Impact: This is a common beginner issue when opening files multiple times without tracking handles.

Example of an issue:

f1 = open("example.txt", "w")
f2 = open("example.txt", "r")  # Oops! Now two handles are open.
f1.close()  # Only one is closed; the other is still open.
# The file is still locked because f2 is open.

  • Solution: Always use with statements so that every handle is closed automatically.
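
When several handles to the same file are needed at once, one way to follow this advice is contextlib.ExitStack, which closes everything it manages when the block ends. This is a minimal sketch using a temporary file. The variable names are illustrative and it is not taken from py7zr.

```python
import os
import tempfile
from contextlib import ExitStack

# Scratch file for the demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

with ExitStack() as stack:
    # Every handle registered on the stack is closed when the block exits,
    # even if an exception is raised inside it.
    writer = stack.enter_context(open(path, "w"))
    reader = stack.enter_context(open(path, "r"))
    writer.write("Hello, World!")
    writer.flush()  # make the text visible to the second handle
    content = reader.read()

all_closed = writer.closed and reader.closed

# No handle is left open, so the delete is not blocked by this process.
os.remove(path)
```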

4. Pending OS Operations or File System Latency

  • What happens: On network drives or slow disks, the operating system might take some time to release the file lock due to internal caching or delays.
  • Impact: The program finishes, but the file may appear locked for a moment longer.
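
For locks that really are short-lived (an antivirus scan, a slow network share), a common workaround is to retry the delete a few times. The helper below is a sketch under that assumption. The name remove_with_retry and its default values are made up for illustration. It will not help when the handle is held open by the same process for as long as it runs.

```python
import os
import tempfile
import time


def remove_with_retry(path, attempts=5, delay=0.2):
    """Delete `path`, retrying on PermissionError. Returns True on success."""
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except PermissionError:
            # Wait before the next try, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


# Usage on a scratch file that nothing else holds open.
fd, scratch = tempfile.mkstemp()
os.close(fd)
removed = remove_with_retry(scratch)
```

If the function returns False after all attempts, the lock is probably not transient and the open handle has to be found and closed instead.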

Summary and Best Practices

  • Use flush(), followed by os.fsync(), if you’re writing and need the data on disk immediately.
  • Use with statements to safely manage file operations.
  • Be aware of antivirus software interfering.
  • Avoid accidentally opening the same file more than once without closing all handles.

By keeping these points in mind, you can avoid file-locking issues and ensure your programs behave as expected on Windows!

@miurahr
Owner

miurahr commented Oct 13, 2024

This may happen because SevenZipFile raises an exception when opening the given corrupted file.

The SevenZipFile class defines:

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

    def _fpclose(self) -> None:
        assert self._fileRefCnt > 0
        self._fileRefCnt -= 1
        if not self._fileRefCnt and not self._filePassed:
            self.fp.close()
 
    def close(self):
        """Flush all the data into archive and close it.
        When close py7zr start reading target and writing actual archive file.
        """
        if "w" in self.mode:
            self._write_flush()
        if "a" in self.mode:
            self._write_flush()
        if "r" in self.mode:
            if self.reporterd is not None:
                self.q.put_nowait(None)
                self.reporterd.join(1)
                if self.reporterd.is_alive():
                    raise InternalError("Progress report thread terminate error.")
                self.reporterd = None
        self._fpclose()
        self._var_release()

@miurahr
Owner

miurahr commented Oct 13, 2024

The SevenZipFile class supports multithreaded extraction, so while another worker is still running, the file is not closed immediately. That may be one reason.

miurahr added the bug (Something isn't working) label and removed the question (Further information is requested) label on Oct 13, 2024
@miurahr
Owner

miurahr commented Oct 13, 2024

        assert self._fileRefCnt > 0
        self._fileRefCnt -= 1
        if not self._fileRefCnt and

This part may be what leaves the file unclosed.
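
To illustrate how a reference count like this could leave a file open, here is a toy model. It is not py7zr's code: the class RefCountedFile, its acquire and release methods, and both worker functions are hypothetical, and the thread does not confirm that py7zr reaches this state. It only shows that if one reference is taken and never released because of an exception, the owner's final release does not close the file.

```python
import io


class RefCountedFile:
    """Hypothetical toy model of reference-counted closing; not py7zr's class."""

    def __init__(self, fp):
        self.fp = fp
        self._fileRefCnt = 1  # the reference held by the owner

    def acquire(self):
        self._fileRefCnt += 1

    def release(self):
        assert self._fileRefCnt > 0
        self._fileRefCnt -= 1
        if not self._fileRefCnt:
            self.fp.close()


def failing_worker(shared):
    shared.acquire()
    raise ValueError("Corrupt input data")  # release() is never reached


def safe_worker(shared):
    shared.acquire()
    try:
        raise ValueError("Corrupt input data")
    finally:
        shared.release()  # the reference is returned even on failure


def run(worker):
    """Run `worker`, then release the owner's reference; report if fp closed."""
    shared = RefCountedFile(io.BytesIO(b"archive bytes"))
    try:
        worker(shared)
    except ValueError:
        pass
    shared.release()  # plays the role of the owner's close()
    return shared.fp.closed


closed_after_failing_worker = run(failing_worker)
closed_after_safe_worker = run(safe_worker)
```

In this model the count is still 1 after the failing worker, so the underlying file stays open. Releasing in a finally block brings the count back to 0 and the file is closed.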
