FIX: filename decryption issue #10

dhgatjeye · 2025-01-25T21:07:36Z

Problem Description:
When decrypting URL-encoded filenames that contain Turkish or other non-ASCII characters, the function does not restore the original filename. Example:

Original Filename: Yeni klasör (2)
Decrypted Filename: Yeni+klas%C3%B6r+%282%29

The issue stems from incomplete Unicode character handling during URL decoding, which can:

Corrupt special characters
Misinterpret Turkish character encodings
Potentially break file naming across different systems
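
To illustrate the symptom described above, here is a minimal sketch using only the standard library. It assumes the filenames are form-encoded (where `+` stands for a space), which the example name suggests but the PR does not state outright. `unquote` decodes the UTF-8 percent-escapes but leaves `+` untouched, while `unquote_plus` also turns `+` into a space and still recovers a literal plus sign from `%2B`.

```python
from urllib.parse import unquote, unquote_plus

# The name from the example above, as it arrives URL-encoded.
encoded = "Yeni+klas%C3%B6r+%282%29"

# unquote() decodes %XX escapes as UTF-8 but keeps '+' as a literal plus.
plain = unquote(encoded)

# unquote_plus() first maps '+' to ' ' and then decodes the %XX escapes,
# so an encoded plus sign (%2B) still comes out as '+'.
form_style = unquote_plus(encoded)

print(plain)        # Yeni+klasör+(2)
print(form_style)   # Yeni klasör (2)
print(unquote_plus("Hello+World%2B"))  # Hello World+
```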

This PR fixes these problems.

You can also run the tests:

import pathlib
from urllib.parse import unquote
from typing import Union


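# Characters that Windows does not allow in file names.
INVALID_CHARS = '<>:"|?*\0'
# Reserved Windows device names, which cannot be used as file names.
DEVICE_NAMES = {'CON', 'PRN', 'AUX', 'NUL', 'COM1', 'COM2', 'COM3', 'COM4',
                'COM5', 'COM6', 'COM7', 'COM8', 'COM9', 'LPT1', 'LPT2',
                'LPT3', 'LPT4', 'LPT5', 'LPT6', 'LPT7', 'LPT8', 'LPT9'}
# Temporary marker that protects encoded plus signs while decoding.
PLUS_PLACEHOLDER = '§PLUS§'


def fix_filename(path: Union[str, pathlib.Path]) -> pathlib.Path:
    """Decode URL-encoded path components and replace characters Windows rejects."""
    path = pathlib.Path(str(path))

    parts = []
    for part in path.parts:
        # Keep the drive and root (e.g. '/' or 'C:\\') untouched.
        if part in (path.drive, path.anchor):
            parts.append(part)
            continue

        decoded = str(part)
        if '%' in decoded:
            # Protect encoded plus signs (either hex case) so they survive
            # the '+' -> ' ' conversion below.
            decoded = decoded.replace('%2B', PLUS_PLACEHOLDER)
            decoded = decoded.replace('%2b', PLUS_PLACEHOLDER)
            decoded = unquote(decoded)
            decoded = decoded.replace('+', ' ')
            decoded = decoded.replace(PLUS_PLACEHOLDER, '+')

        # Replace invalid and control characters with '-'.
        cleaned = ''.join(c if c not in INVALID_CHARS and ord(c) >= 32 else '-' for c in decoded)
        cleaned = cleaned.strip('. ')

        if cleaned.upper() in DEVICE_NAMES:
            cleaned = f'_{cleaned}_'

        # Empty names, and names that mix only spaces and plus signs, fall
        # back to '_'. Names made solely of plus signs are kept as they are.
        if not cleaned or (set(cleaned) <= {' ', '+'} and ' ' in cleaned):
            cleaned = '_'

        parts.append(cleaned)

    return pathlib.Path(*parts)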


def test_filename_fixes():
    test_cases = [
        ("Yeni+klas%C3%B6r+%282%29", "Yeni klasör (2)"),
        ("Dosya%20adı.txt", "Dosya adı.txt"),
        ("Yeni klasör (2)+", "Yeni klasör (2)+"),
        ("Document+.txt", "Document+.txt"),
        ("Hello+World%2B", "Hello World+"),
        ("Test%2B+File", "Test+ File"),
        ("%D0%9F%D1%80%D0%B8%D0%BC%D0%B5%D1%80%2B", "Пример+"),
        ("%E4%BD%A0%E5%A5%BD+%2B+File.txt", "你好 + File.txt"),
        ("%F0%9F%98%80+Smile%2B", "😀 Smile+"),
        ("plain_filename.txt", "plain_filename.txt"),
        ("Hello World+.txt", "Hello World+.txt"),
        ("%2B%2B%2B", "+++"),
        ("%2BFile%2B", "+File+"),
        ("No+encoding%21", "No encoding!"),
        ("File%20Name%21%40%23%24.txt", "File Name!@#$.txt"),
        ("%C3%87%C4%B1lg%C4%B1n+Dosya.txt", "Çılgın Dosya.txt"),
    ]

    failed_cases = []
    for encoded, expected in test_cases:
        result = fix_filename(pathlib.Path(encoded))
        if str(result) != expected:
            failed_cases.append({
                'input': encoded,
                'expected': expected,
                'got': result
            })
            print("\nTest case:")
            print(f"Input:    {encoded}")
            print(f"Expected: {expected}")
            print(f"Got:      {result}")
            print("Pass:     False")

    if failed_cases:
        print("\nFailed Test Cases:")
        for case in failed_cases:
            print(f"Input: {case['input']}")
            print(f"Expected: {case['expected']}")
            print(f"Got: {case['got']}\n")
    else:
        print("All test cases passed successfully!")

test_filename_fixes()
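
As a cross-check of the decoding step only, the sketch below compares a few of the test cases above against the standard library's `unquote_plus`. The helper name `decode_component` is hypothetical and not part of this PR. It keeps the same `'%' in name` guard as `fix_filename`, so names without percent-escapes are left alone, and it does not attempt any of the Windows-specific cleanup.

```python
from urllib.parse import unquote_plus


def decode_component(name: str) -> str:
    """Hypothetical helper: decode one path component.

    Only names that contain a percent-escape are treated as encoded, so a
    plain name such as 'Document+.txt' keeps its plus sign.
    """
    return unquote_plus(name) if '%' in name else name


# A subset of the PR's test cases, plus a lowercase-hex escape.
cases = [
    ("Yeni+klas%C3%B6r+%282%29", "Yeni klasör (2)"),
    ("Document+.txt", "Document+.txt"),
    ("Hello+World%2B", "Hello World+"),
    ("Test%2B+File", "Test+ File"),
    ("%C3%87%C4%B1lg%C4%B1n+Dosya.txt", "Çılgın Dosya.txt"),
    ("%2b%2b", "++"),
]

mismatches = [(src, decode_component(src), want)
              for src, want in cases
              if decode_component(src) != want]
print("mismatches:", mismatches)
```

If the list of mismatches is empty, the placeholder-based decoding in `fix_filename` and `unquote_plus` agree on these inputs.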

giacomoferretti · 2025-02-05T07:48:17Z

Gonna test as soon as possible. The code looks right, but because I never stumbled upon it, I prefer to test it.

dhgatjeye · 2025-02-05T18:20:36Z

> Gonna test as soon as possible. The code looks right, but because I never stumbled upon it, I prefer to test it.

Yeah, okay! Thank you.

dhgatjeye added 8 commits January 23, 2025 22:59

d522090 added new feature
2cb4184 added cookie.py
5f33103 added cookie
6e381ef Update README.md
35857d5 Update README.md
9d110db Update cookie.py
ab643d2 Update README.md
a499ebd FIX: filename decryption issue

dhgatjeye requested a review from giacomoferretti as a code owner January 25, 2025 21:07
