Create a pdf-loader folder in the repo and add an unstructured_loader.pdf file.
Write the file, following unstructured's docs, so that it takes a PDF as input and produces a Markdown document as output (if that's what the docs suggest).
Sidenote: create a folder called data/ in your repo and store the actual PDF there so that the code can retrieve it. Then upload the result as a Markdown document to the same folder. If git starts tracking it, add data/ to the .gitignore, though it might already be there.
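A minimal sketch of what the loader could look like, assuming `partition_pdf` from `unstructured.partition.pdf` is used for parsing and that each returned element exposes `category` and `text`. The Markdown mapping (titles as `##` headings, list items as bullets), the helper names `elements_to_markdown` and `pdf_to_markdown`, and the `DATA_DIR` constant are illustrative choices, not something the unstructured docs prescribe.

```python
from pathlib import Path

# Assumed location of the input PDF and the generated Markdown (see the sidenote).
DATA_DIR = Path("data")


def elements_to_markdown(elements) -> str:
    """Render parsed elements (anything with .category and .text) as Markdown."""
    blocks = []
    for element in elements:
        text = (getattr(element, "text", "") or "").strip()
        if not text:
            continue  # skip empty elements
        category = getattr(element, "category", "")
        if category == "Title":
            blocks.append(f"## {text}")
        elif category == "ListItem":
            blocks.append(f"- {text}")
        else:
            blocks.append(text)  # narrative text and anything unrecognized
    return "\n\n".join(blocks) + "\n"


def pdf_to_markdown(pdf_path: Path) -> Path:
    """Parse a PDF with unstructured and write a .md file next to it."""
    # Imported lazily so this module still loads without unstructured installed.
    from unstructured.partition.pdf import partition_pdf

    elements = partition_pdf(filename=str(pdf_path))
    out_path = pdf_path.with_suffix(".md")
    out_path.write_text(elements_to_markdown(elements), encoding="utf-8")
    return out_path
```

Calling `pdf_to_markdown(DATA_DIR / "example.pdf")` (a placeholder file name) would write `data/example.md`. Keeping the rendering step separate from the parsing step makes it easy to try a different mapping later, for example once it is clearer which metadata is worth keeping.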
@alvaro-mazcu and I started working on this today. We parsed the first chapter of the geography book and started looking into what kind of metadata we can use. We also tested parsing a math paper to explore formula parsing.