|
1 | 1 | # Readwise Mirror Plugin
|
2 | 2 |
|
| 3 | +> [!WARNING] |
| 4 | +> Readwise Mirror 2.x is a major rewrite of the plugin that might break things because of changes to how filenames are generated and validated. The documentation contains a step-by-step guide on how to prepare an existing Readwise library by adding the `uri` tracking property to your items before upgrading, so that links to items in your Readwise library are updated. You can find the guide [here](#upgrading-from-1xx-to-2xx). |
| 5 | +
|
3 | 6 | **Readwise Mirror** is an unofficial open-source plugin for the powerful note-taking and knowledge-base application [Obsidian](http://obsidian.md/). This plugin allows a user to "mirror" their entire Readwise library by automatically downloading all highlights/notes and syncing changes directly into an Obsidian vault.
|
4 | 7 |
|
5 | 8 | 
|
@@ -429,66 +432,113 @@ Rendered output:
|
429 | 432 |
|
430 | 433 | ## Deduplication
|
431 | 434 |
|
432 |
| -If File Tracking is enabled, the plugin prevents duplicate files when articles are re-imported from Readwise, maintaining link consistency in your vault. This can be useful in a number of cases: |
| 435 | +The plugin implements a deduplication strategy to handle both tracked and untracked files, ensuring consistency in your vault while preserving existing content and links. |
433 | 436 |
|
434 |
| -- if you change the character used to escape the colon `:` in your titles, |
435 |
| -- when changing to use the "Slugify" feature, or changing its options, and |
436 |
| -- if the title of a Readwise item changes at the source |
| 437 | +### File Tracking |
437 | 438 |
|
438 |
| -### How It Works |
| 439 | +When File Tracking is enabled (via `trackFiles` and `trackingProperty` settings), the plugin uses the `highlights_url` property to track unique documents from Readwise. |
439 | 440 |
|
440 |
| -1. **File Matching** |
441 |
| - - Uses MetadataCache to find files with matching `readwise_url` |
442 |
| - - Checks all vault locations, not just the Readwise folder |
443 |
| - - Honors existing file structure |
| 441 | +## Deduplication Strategy |
444 | 442 |
|
445 |
| -2. **Update Strategy** |
446 |
| - - If exact filename match exists: Updates content in place |
447 |
| - - If different filename exists: Updates first matching file, and changes this file's filename to the new filename |
448 |
| - - Additional matches: Either deleted or marked as duplicates (default, with the `duplicate` property set to `true`) |
| 443 | +### Path-Based Grouping |
449 | 444 |
|
450 |
| -3. **Link Preservation** |
451 |
| - - Maintains existing internal links |
452 |
| - - Preserves file locations in vault |
453 |
| - - Updates content while keeping references intact |
| 445 | +Files are first grouped by their normalized path (category + filename), which handles potential case-sensitivity issues across different filesystems. This ensures consistent behavior regardless of your operating system, while leaving items that share a filename but belong to different categories untouched (e.g. Books and Supplemental Books). |
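The grouping step described above could be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the `PendingFile` type and the function names are hypothetical, and the normalization (Unicode NFC plus lower-casing) is an assumption about what "normalized path" means here.

```typescript
// Hypothetical shape of an item about to be written; not the plugin's real type.
interface PendingFile {
  category: string;
  filename: string;
}

// Assumed normalization: Unicode NFC plus lower-casing, so that
// "My Article.md" and "my article.md" map to the same key.
function normalizedPathKey(file: PendingFile): string {
  return `${file.category}/${file.filename}`.normalize("NFC").toLowerCase();
}

// Group files by normalized path. Files in different categories never
// share a group, even when their filenames are identical.
function groupByNormalizedPath(files: PendingFile[]): Map<string, PendingFile[]> {
  const groups = new Map<string, PendingFile[]>();
  for (const file of files) {
    const key = normalizedPathKey(file);
    const group = groups.get(key) ?? [];
    group.push(file);
    groups.set(key, group);
  }
  return groups;
}
```

Any group holding more than one entry is a path collision that the processing logic then has to resolve.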
454 | 446 |
|
455 |
| -### Duplicate Handling |
| 447 | +### Processing Logic |
456 | 448 |
|
457 |
| -If File Tracking is enabled, when duplicates are found: |
| 449 | +#### For Tracked Files (File Tracking Enabled) |
458 | 450 |
|
459 |
| -1. **Exact Match** |
| 451 | +1. **Existing Files with Matching `highlights_url`** |
460 | 452 |
|
461 | 453 | ```shell
|
462 |
| - 📄 "My Article.md" (existing) |
463 |
| - └── Updates content in place |
| 454 | + 📄 "My Article.md" (primary, matching highlights_url) |
| 455 | + └── Updates content and frontmatter |
| 456 | + 📄 "Same Article.md" (duplicate, matching highlights_url) |
| 457 | + └── Either deleted or marked as duplicate: true |
464 | 458 | ```
|
465 | 459 |
|
466 |
| -2. **Different Filename** |
| 460 | +2. **Path Collision (Different `highlights_url`)** |
467 | 461 |
|
468 | 462 | ```shell
|
469 |
| - 📄 "Article (2024).md" (existing) |
470 |
| - |
471 |
| - └── Updates content, changes filename to "My Article.md" |
| 463 | + 📄 "My Article.md" (existing file) |
| 464 | + 📄 "My Article <hash>.md" (new file) |
| 465 | + └── Creates new file with hash suffix |
472 | 466 | ```
|
473 | 467 |
|
474 |
| -3. **Multiple Matches** |
| 468 | +#### For Untracked Files (File Tracking Disabled) |
475 | 469 |
|
476 |
| - ```shell |
477 |
| - 📄 "My Article.md" (primary) |
478 |
| - └── Updated with new content |
479 |
| - 📄 "Same Article.md" (duplicate) |
480 |
| - └── Deleted or marked with duplicate: true |
481 |
| - ``` |
| 470 | +When multiple files share the same path: |
| 471 | + |
| 472 | +```shell |
| 473 | +📄 "My Article.md" (first file) |
| 474 | +📄 "My Article <hash1>.md" (second file) |
| 475 | +📄 "My Article <hash2>.md" (third file) |
| 476 | +└── First file keeps original name, others get unique hashes |
| 477 | +``` |
| 478 | + |
| 479 | +### File Operations |
| 480 | + |
| 481 | +The plugin carefully manages file operations to maintain vault consistency: |
| 482 | + |
| 483 | +1. **Content Updates** |
| 484 | + - Preserves original file creation and modification timestamps |
| 485 | + - Selectively updates frontmatter based on `updateFrontmatter` setting |
| 486 | + - Handles filename changes while maintaining internal links and metadata |
| 487 | + |
| 488 | +2. **Duplicate Management** |
| 489 | + Based on your settings, duplicates are handled in one of two ways: |
| 490 | + - When `deleteDuplicates: true`, duplicate files are moved to trash |
| 491 | + - When `deleteDuplicates: false`, duplicates are marked with `duplicate: true` in frontmatter |
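As a rough sketch of the two duplicate-handling modes above: the `deleteDuplicates` setting and the `duplicate: true` frontmatter value come from the text, while the `DuplicateAction` type and `planDuplicateActions` function are hypothetical names used only for illustration. The sketch only plans the actions and does not touch the vault.

```typescript
// Hypothetical description of what should happen to one duplicate file.
type DuplicateAction =
  | { kind: "trash"; path: string }
  | { kind: "mark"; path: string; frontmatter: { duplicate: boolean } };

// Decide, per duplicate, whether it is trashed or only flagged in frontmatter.
function planDuplicateActions(
  duplicatePaths: string[],
  deleteDuplicates: boolean
): DuplicateAction[] {
  return duplicatePaths.map((path): DuplicateAction =>
    deleteDuplicates
      ? { kind: "trash", path }
      : { kind: "mark", path, frontmatter: { duplicate: true } }
  );
}
```

Separating the decision from the file operation keeps the destructive step (moving to trash) in one easily audited place.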
| 492 | + |
| 493 | +## Special Considerations |
| 494 | + |
| 495 | +### Filename Changes |
| 496 | + |
| 497 | +The plugin implements a robust strategy for handling filename changes: |
| 498 | + |
| 499 | +1. First attempts a direct rename to the new filename |
| 500 | +2. If a file already exists at the target path, creates a new file with a hash suffix |
| 501 | +3. Throughout the process, preserves all metadata and internal links |
482 | 502 |
|
483 |
| -### Deduplication limitations |
| 503 | +### Remote Duplicates |
484 | 504 |
|
485 |
| -Currently, the following limitations apply to deduplication: |
| 505 | +Readwise can contain multiple items sharing the same title but with different IDs. The plugin handles these cases by: |
| 506 | + |
| 507 | +1. Using the plain filename (e.g. `My Duplicate Book.md`) for the first encountered item |
| 508 | +2. Adding a short hash of the Readwise ID to subsequent files (e.g. `My Duplicate Book <HASH>.md`) |
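The naming scheme for remote duplicates might look something like the sketch below. It assumes items are processed in ascending Readwise ID order, as the limitations section mentions. The `ReadwiseItem` type, the function names, and the FNV-1a hash are stand-ins chosen for illustration; the text does not say which hash the plugin uses or how long the suffix is.

```typescript
// Hypothetical minimal item shape; the real Readwise export has more fields.
interface ReadwiseItem {
  id: number;
  title: string;
}

// FNV-1a (32-bit) as a stand-in hash; the plugin's actual algorithm is not
// stated in the text.
function shortHash(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  const hex = (hash >>> 0).toString(16);
  return ("0000000" + hex).slice(-8);
}

// The first item (lowest ID) gets the plain filename; later items with the
// same title get a hash of their ID appended.
function assignFilenames(items: ReadwiseItem[]): Map<number, string> {
  const taken = new Set<string>();
  const result = new Map<number, string>();
  const sorted = items.slice().sort((a, b) => a.id - b.id);
  for (const item of sorted) {
    const plain = `${item.title}.md`;
    const key = plain.toLowerCase(); // assumed case-insensitive comparison
    if (!taken.has(key)) {
      taken.add(key);
      result.set(item.id, plain);
    } else {
      result.set(item.id, `${item.title} ${shortHash(String(item.id))}.md`);
    }
  }
  return result;
}
```

Because the suffix depends only on the item's ID, a hashed filename stays stable across syncs as long as the same item keeps receiving the suffix.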
| 509 | + |
| 510 | +## Deduplication Limitations |
| 511 | + |
| 512 | +The current implementation has several important considerations: |
| 513 | + |
| 514 | +- File ordering affects clean filename assignment, though we mitigate this by sorting by Readwise ID (ascending) |
| 515 | +- Initial setup requires a full sync to establish proper tracking properties |
| 516 | +- During the initial full sync, local modifications may be overwritten |
| 517 | +- Platform differences in case-sensitivity are handled through normalized path comparison |
| 518 | + |
| 519 | +## Best Practices |
| 520 | + |
| 521 | +To get the most out of the deduplication system: |
| 522 | + |
| 523 | +1. Enable File Tracking for the most reliable deduplication experience |
| 524 | +2. Run a full sync when first enabling tracking |
| 525 | +3. Consider maintaining unique titles in Readwise to minimize hash suffix usage |
| 526 | + |
| 527 | +## Upgrading from 1.x.x to 2.x.x |
| 528 | + |
| 529 | +If you are upgrading from 1.x.x to 2.x.x and want to preserve your existing links to items in your Readwise library, follow these steps before upgrading the plugin: |
| 530 | + |
| 531 | +1. Make sure you have a backup of your vault (or at least your Readwise Mirror folder) |
| 532 | +2. In the plugin settings, add the `uri` tracking property to the Frontmatter template. Just add the following at the end of the template and enable Frontmatter[^1]: |
| 533 | + |
| 534 | + ```yaml |
| 535 | + uri: {{ highlights_url }} |
| 536 | + ``` |
486 | 537 |
|
487 |
| -- Readwise items with the exact same title will be detected, the first one in the export will be used to write to your vault |
488 |
| -- To start using deduplication, you have to run a full sync to make sure all your files have the deduplication frontmatter property and can thus be deduplicated. This means any changes you made to your local files will be lost (this is not a new behaviour though, but you should be aware of it) |
| 538 | +3. Run **a full sync** to establish proper tracking properties (this will overwrite your local changes, but will preserve the filenames of your existing files as generated by version 1.4.11 of the plugin) |
489 | 539 |
|
490 |
| -### Readwise (remote) duplicates |
| 540 | +4. Upgrade the plugin to 2.x.x and enable File Tracking |
491 | 541 |
|
492 |
| -In Readwise, multiple items with the same title but different `id`s can exist. This leads to a filename collision in `readwise-mirror`. If such a duplicate ('remote duplicate') is detected (because a file already exists with the "same" filename), the plugin will write a file which has a short hash of the Readwise `id` value added to the filename of all detected duplicates. |
| 542 | +Your subsequent syncs will then use the `uri` property to track unique files and ensure links to items in your Readwise library will be updated, even if the generated filenames change with the new version of the plugin. |
493 | 543 |
|
494 |
| -The filename of two different Readwise items both titled `My Duplicate Book` would thus become `My Duplicate Book.md` and `My Duplicate Book <HASH>.md` where `<HASH>` would be a short hashed `id` value of the second item the plugin encounters when downloading. As this order can change between runs of the plugin (e.g., because of changes to one item which changes the order in the returned data), the filenames might change as well from run to run. |
| 544 | +[^1]: You might want to ensure that properties like `author` are omitted from the template as these have a tendency to break frontmatter. Alternatively, you can use the `authorStr` variable, or run a plugin like "Linter" to check and fix all your Readwise notes before upgrading. |