another form of the sentence splitting function (Testing...Does NOT work) #473
Closed
Running the Dev test workflow for this file as well, as you can see here: https://github.com/DrewThomasson/ebook2audiobook/actions/runs/13792746551
Nope, testing it locally shows this function breaks. Removing the workflow from the queue.
Full log:
drew@wmughal-CN4D09397T ebook2audiobook % ./ebook2audiobook.sh
v25.3.10 native mode
IPs available for connection:
['127.0.0.1', '::1', 'fe80::1%lo0', '10.5.167.48', 'fe80::825:8530:ff35:8d36%en0', 'fe80::f4d4:88ff:fe9d:7259%ap1', 'fe80::1891:7cff:fedd:85b8%awdl0', 'fe80::1891:7cff:fedd:85b8%llw0', 'fe80::a4fb:2110:4319:2225%utun0', 'fe80::ddfe:7bd7:2805:ee6e%utun1', 'fe80::ce81:b1c:bd2c:69e%utun2', 'fe80::fd5:243a:9e68:883b%utun3', 'fe80::bde5:2d9b:824a:4557%utun4', 'fe80::3667:9ce3:d5c6:d826%utun6']
Note: 0.0.0.0 is not the IP to connect. Instead use an IP above to connect.
* Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/queueing.py", line 625, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/blocks.py", line 2099, in process_api
inputs = await self.preprocess_data(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/blocks.py", line 1794, in preprocess_data
processed_input.append(block.preprocess(inputs_cached))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/components/dropdown.py", line 202, in preprocess
raise Error(
gradio.exceptions.Error: 'Value: internal is not in the list of choices: []'
Processing eBook file: alice.txt
GPU is not available on your device!
Available Processor Unit: cpu
Running command: /opt/homebrew/bin/ebook-convert /Users/drew/ebook2audiobook/tmp/ebook-01205268-1020-4b59-84b2-29b7a9eb8fb0/0abcd0ee87c3df04dd58415c1f7788af/alice.txt /Users/drew/ebook2audiobook/tmp/ebook-01205268-1020-4b59-84b2-29b7a9eb8fb0/0abcd0ee87c3df04dd58415c1f7788af/__alice.epub
Conversion options changed from defaults:
output_profile: 'generic_eink'
input_encoding: 'utf-8'
epub_version: '3'
smarten_punctuation: True
verbose: 1
disable_font_rescaling: True
flow_size: 0
1% Converting input to HTML...
InputFormatPlugin: TXT Input running
on /Users/drew/ebook2audiobook/tmp/ebook-01205268-1020-4b59-84b2-29b7a9eb8fb0/0abcd0ee87c3df04dd58415c1f7788af/alice.txt
Reading text from file...
Using user specified input encoding of utf-8
Auto detected paragraph type as unformatted
Auto detected formatting as heuristic
Running text through basic conversion...
Language not specified
Creator not specified
Building file list...
Found files...
HTMLFile:0:a:'/Users/drew/ebook2audiobook/tmp/calibre_7.24.0_tmp_q157msxy/tiuz74so_plumber/index.html'
Normalizing filename cases
Rewriting HTML links
Parsing index.html ...
********* Heuristic processing HTML *********
There are 16 blank lines. 0.41025641025641024 percent blank
minimum chapters required are: 1
found 0 pre-existing headings
Total wordcount is: 1615, Average words per section is: 1615, Marked up 0 chapters
deleting blank lines
Hard line breaks check returned False
Median line length is 252, calculated with html format
Fixing hyphenated content
Looking for more split points based on punctuation, currently have 0
Formatting scene breaks
Forcing index.html into XHTML namespace
34% Running transforms on e-book...
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Found 23 items of level: p_1
Ignoring level p_1
Cleaning up manifest...
Trimming unused files from manifest...
Creating EPUB Output...
67% Running EPUB Output plugin
Splitting markup on page breaks and flow limits, if any...
Generating default cover
This EPUB file has no Table of Contents. Creating a default TOC
Upgrading to EPUB 3...
EPUB output written to /Users/drew/ebook2audiobook/tmp/ebook-01205268-1020-4b59-84b2-29b7a9eb8fb0/0abcd0ee87c3df04dd58415c1f7788af/__alice.epub
Output saved to /Users/drew/ebook2audiobook/tmp/ebook-01205268-1020-4b59-84b2-29b7a9eb8fb0/0abcd0ee87c3df04dd58415c1f7788af/__alice.epub
/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/ebooklib/epub.py:1423: FutureWarning: This search incorrectly ignores the root element, and will be fixed in a future version. If you rely on the current behaviour, change it to './/xmlns:rootfile[@media-type]'
for root_file in tree.findall('//xmlns:rootfile[@media-type]', namespaces={'xmlns': NAMESPACES['CONTAINERNS']}):
******* NOTE: YOU CAN SAFELY IGNORE "Character xx not found in the vocabulary." *******
Error extracting main content pages: maximum recursion depth exceeded
Traceback (most recent call last):
File "/Users/drew/ebook2audiobook/lib/functions.py", line 544, in get_chapters
doc_cache[doc] = filter_chapter(doc, session['language'], session['language_iso1'], session['tts_engine'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/lib/functions.py", line 591, in filter_chapter
chapter_sentences = get_sentences(phoneme_list, max_tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/lib/functions.py", line 688, in get_sentences
sentences.extend(advanced_split(current_sentence.strip()))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/lib/functions.py", line 678, in advanced_split
return advanced_split(part1) + advanced_split(part2)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/lib/functions.py", line 678, in advanced_split
return advanced_split(part1) + advanced_split(part2)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/drew/ebook2audiobook/lib/functions.py", line 678, in advanced_split
return advanced_split(part1) + advanced_split(part2)
^^^^^^^^^^^^^^^^^^^^^
[Previous line repeated 986 more times]
File "/Users/drew/ebook2audiobook/lib/functions.py", line 671, in advanced_split
if any(p in sentence for p in punctuation_split):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded
Caught DependencyError: Error extracting main content pages: maximum recursion depth exceeded
get_chapters() failed!
^CKeyboard interruption in main thread... closing server.
^CServer interrupted by user. Shutting down...
Traceback (most recent call last):
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/blocks.py", line 2959, in block_thread
time.sleep(0.1)
KeyboardInterrupt
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/drew/ebook2audiobook/lib/functions.py", line 2769, in web_interface
outputs=[gr_voice_list, gr_custom_model_list, gr_audiobook_list, gr_modal]
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/blocks.py", line 2865, in launch
self.block_thread()
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/blocks.py", line 2963, in block_thread
self.server.close()
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/site-packages/gradio/http_server.py", line 69, in close
self.thread.join(timeout=5)
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/threading.py", line 1153, in join
self._wait_for_tstate_lock(timeout=max(timeout, 0))
File "/Users/drew/ebook2audiobook/python_env/lib/python3.12/threading.py", line 1169, in _wait_for_tstate_lock
if lock.acquire(block, timeout):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Caught DependencyError: Server interrupted by user. Shutting down...
^C
So not merging then...
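For reference, the traceback in the log shows `advanced_split` calling itself on `part1` and `part2` until Python's recursion limit is hit. One plausible cause (an assumption, since the function body is not shown here) is that a split can return a part that is no shorter than its input, so the recursion never makes progress. Below is a minimal iterative sketch that avoids this by always producing strictly shorter parts. It is not the project's code: `PUNCTUATION_SPLIT`, `max_len`, and the character-based length limit are hypothetical stand-ins for the real `punctuation_split` and `max_tokens`.

```python
from typing import List

# Hypothetical values: the real `punctuation_split` list used in
# lib/functions.py is not shown in the log.
PUNCTUATION_SPLIT = [",", ";", ":", " "]


def advanced_split(sentence: str, max_len: int = 40) -> List[str]:
    """Split `sentence` into pieces of at most `max_len` characters.

    Uses an explicit stack instead of recursion, so input length cannot
    trigger a RecursionError. Length is measured in characters here,
    which is an assumption; the original appears to count tokens.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    pieces: List[str] = []
    stack = [sentence.strip()]
    while stack:
        current = stack.pop()
        if not current:
            continue  # drop empty fragments left over after stripping
        if len(current) <= max_len:
            pieces.append(current)
            continue
        # Right-most punctuation mark within the first `max_len` characters.
        cut = max(current.rfind(p, 0, max_len) for p in PUNCTUATION_SPLIT)
        # Fall back to a hard cut when no mark is found (or it sits at
        # index 0), so both parts are always strictly shorter than `current`.
        cut = cut + 1 if cut > 0 else max_len
        part1, part2 = current[:cut].strip(), current[cut:].strip()
        # Push part2 first so part1 is handled first (the stack is LIFO).
        stack.append(part2)
        stack.append(part1)
    return pieces
```

Because every cut leaves both parts shorter than the string being split, the loop terminates even on long text with no punctuation at all, which is the case a regression test for this PR would want to cover.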
No description provided.