Update of PMPY for Jupiter/AIP 2.0 structure (#142)
Including:

* Add connection to database to replace owner ids with email addresses
* File order file constructed through list source proxies
* Put AIP version in bag metadata
* SolrFetcher isn't used anymore, removed
cwant authored Mar 26, 2018
1 parent 1242ace commit 6dcfd4e
Showing 52 changed files with 6,362 additions and 2,351 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -17,11 +17,11 @@ Its primary job is to manage the flow of content from Fedora into Swift for pres

 ## Workflow

-1. Any save (create or update) on a GenericFile in ERA will trigger an after save callback that will push the GenericFile unique identifier (NOID) into a Queue.
-2. The queue (Redis) is setup to be a unique set (which only allows one GenericFile NOID to be included in the queue at a single time), and ordered by priority from First In, First out (FIFO).
+1. Any save (create or update) on a Item/Thesis in ERA/Jupiter will trigger an after save callback that will push the item's unique identifier (UUID or NOID) into a Queue.
+2. The queue (Redis) is setup to be a unique set (which only allows one item's UUID to be included in the queue at a single time), and ordered by priority from First In, First out (FIFO).
 3. PushmiPullyu will then monitor the queue. After a certain wait period has passed since an element has been on the queue, PushmiPullyu will then retrieve the elements off the queue and begin to process the preservation event.
-4. All the GenericFile information and data required for preservation are retrieved from Fedora and Solr using multiple REST calls.
-5. An Archival Information Package (AIP) is created from the GenericFile's information. It is then bagged and tarred.
+4. All the GenericFile information and data required for preservation are retrieved from Fedora using multiple REST calls. A database connection to the user database fetches (via ActiveRecord )owner emails and modifies the fetched documents, where applicable.
+5. An Archival Information Package (AIP) is created from the item's information. It is then bagged and tarred.
 6. The AIP tar is then uploaded to Swift via a REST call.
 7. On a successful Swift upload, a entry is added for this preservation event to the preservation event logs.
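
Review note: workflow steps 1 to 3 describe a queue that holds each identifier at most once, hands entries out oldest first, and only releases an entry after a wait period. The sketch below is an in-memory stand-in for that behaviour, not the project's Redis code. The class name `SketchQueue`, its method names, and the choice to refresh the timestamp when an identifier is re-added are all assumptions made for illustration.

```ruby
# In-memory stand-in for the Redis-backed queue described in steps 1-3.
# Hypothetical class: the real queue lives in Redis and may behave differently.
class SketchQueue
  def initialize(minimum_age: 0)
    @minimum_age = minimum_age
    @entries = {} # identifier => time (as a Float) the identifier was last queued
  end

  # Re-adding an identifier never creates a duplicate entry. In this sketch
  # it refreshes the timestamp, so repeated saves postpone processing
  # (an assumption, not confirmed by the README).
  def add(id, now: Time.now)
    @entries[id] = now.to_f
  end

  # Returns and removes the oldest identifier that has waited at least
  # minimum_age seconds, or nil when nothing is ready yet.
  def next_ready(now: Time.now)
    id, queued_at = @entries.min_by { |_id, time| time }
    return nil if id.nil? || now.to_f - queued_at < @minimum_age

    @entries.delete(id)
    id
  end

  def size
    @entries.size
  end
end
```

With `minimum_age: 10`, an identifier saved twice in quick succession occupies one slot and becomes available ten seconds after its most recent save.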

Binary file modified docs/images/system-infrastructure-diagram.png
10 changes: 7 additions & 3 deletions examples/pushmi_pullyu.yml
@@ -8,6 +8,7 @@
 # PushmiPullyu will run this file through ERB when reading it so you can
 # even put in dynamic logic, like consuming ENV Variables.

+aip_version: 'lightaip-2.0'
 debug: false
 logdir: log
 monitor: false
@@ -20,15 +21,18 @@ minimum_age: 0
 redis:
   url: redis://localhost:6379

-solr:
-  url: http://localhost:8983/solr/development
-
 fedora:
   url: http://localhost:8080/fcrepo/rest
   user: fedoraAdmin
   password: fedoraAdmin
   base_path: /dev

+database:
+  encoding: utf8
+  url: postgresql://jupiter:mysecretpassword@127.0.0.1
+  database: jupiter_development
+  pool: 5
+
 #parameters project_name and project_domain_name are required only for keystone v3 authentication
 swift:
   tenant: tester
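
Review note: the comment at the top of this example file says the YAML is run through ERB before it is read, so values can come from environment variables. A minimal sketch of that two-step load follows, using only the Ruby standard library. The helper name `load_config` and the `REDIS_URL` variable are hypothetical, and the way PushmiPullyu itself loads the file is not shown in this commit view.

```ruby
require 'erb'
require 'yaml'

# Render any ERB tags first, then parse the rendered text as YAML.
# Hypothetical helper, not the project's own loader.
def load_config(text)
  YAML.safe_load(ERB.new(text).result, symbolize_names: true)
end

sample = <<~CONFIG
  aip_version: 'lightaip-2.0'
  redis:
    url: <%= ENV.fetch('REDIS_URL', 'redis://localhost:6379') %>
CONFIG

config = load_config(sample)
puts config[:redis][:url]
```

The same pattern would let the `database` section read its `url` from the environment instead of keeping a password in the file.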
16 changes: 11 additions & 5 deletions lib/pushmi_pullyu.rb
@@ -8,8 +8,11 @@
 require 'pushmi_pullyu/aip'
 require 'pushmi_pullyu/aip/creator'
 require 'pushmi_pullyu/aip/downloader'
-require 'pushmi_pullyu/aip/solr_fetcher'
 require 'pushmi_pullyu/aip/fedora_fetcher'
+require 'pushmi_pullyu/aip/file_list_creator'
+require 'pushmi_pullyu/aip/owner_email_editor'
+require 'active_record'
+require 'pushmi_pullyu/aip/user'
 require 'pushmi_pullyu/cli'
 require 'pushmi_pullyu/preservation_queue'
 require 'pushmi_pullyu/swift_depositer'
@@ -20,6 +23,7 @@
 # PushmiPullyu main module
 module PushmiPullyu
   DEFAULTS = {
+    aip_version: 'lightaip-2.0',
     daemonize: false,
     debug: false,
     logdir: 'log',
@@ -32,10 +36,6 @@ module PushmiPullyu
     redis: {
       url: 'redis://localhost:6379'
     },
-    # TODO: rest of these are examples for solr/fedora/swift... feel free to fill them in correctly
-    solr: {
-      url: 'http://localhost:8983/solr/development'
-    },
     fedora: {
       url: 'http://localhost:8080/fcrepo/rest',
       user: 'fedoraAdmin',
@@ -52,6 +52,12 @@ module PushmiPullyu
       container: 'ERA'
     },
     rollbar: {
     },
+    database: {
+      encoding: 'utf8',
+      pool: ENV['RAILS_MAX_THREADS'] || 5,
+      url: ENV['DATABASE_URL'] || ENV['JUPITER_DATABASE_URL'] || 'postgresql://jupiter:mysecretpassword@127.0.0.1',
+      database: 'jupiter_development'
+    }
   }.freeze
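
Review note: the new `database` defaults fall back through two environment variables before using a hard-coded URL, and `pool` is a String whenever `RAILS_MAX_THREADS` is set but an Integer otherwise. The sketch below isolates that resolution so it can be exercised with a plain hash in place of `ENV`. The helper `database_options` and the constant `DEFAULT_DATABASE_URL` are hypothetical names, and the `Integer(...)` conversion is a suggested normalization, not something this commit does.

```ruby
# Hypothetical helper that mirrors the fallback chain in DEFAULTS[:database].
DEFAULT_DATABASE_URL = 'postgresql://jupiter:mysecretpassword@127.0.0.1'

def database_options(env = ENV)
  {
    encoding: 'utf8',
    # Environment values are Strings; convert so pool is always an Integer.
    pool: Integer(env['RAILS_MAX_THREADS'] || 5),
    # DATABASE_URL wins over JUPITER_DATABASE_URL, which wins over the default.
    url: env['DATABASE_URL'] || env['JUPITER_DATABASE_URL'] || DEFAULT_DATABASE_URL,
    database: 'jupiter_development'
  }
end

puts database_options({ 'RAILS_MAX_THREADS' => '8' })[:pool]
```

Passing the environment as an argument keeps the precedence rules testable without mutating the process environment.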

6 changes: 5 additions & 1 deletion lib/pushmi_pullyu/aip/creator.rb
@@ -22,11 +22,15 @@ def run
   private

   def bag_aip
-    bag = BagIt::Bag.new(@aip_directory)
+    bag = BagIt::Bag.new(@aip_directory, bag_metadata)
     bag.manifest!
     raise BagInvalid unless bag.valid?
   end

+  def bag_metadata
+    { 'AIP-Version' => PushmiPullyu.options[:aip_version] }
+  end
+
   def tar_bag
     # We want to change the directory to the work directory path so we get the tar file to be exactly
     # the contents of the noid directory and not the entire work directory structure. For example the noid.tar
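
Review note: `bag_metadata` returns a one-entry hash that is handed to the bag so the AIP version is recorded with it. In the BagIt format, such metadata is stored as `Label: value` lines in `bag-info.txt`. The sketch below shows that rendering only. `render_bag_info` is a hypothetical helper that does not call the bagit gem, and the gem may write additional fields of its own.

```ruby
# Hypothetical helper: renders a metadata hash as bag-info.txt style lines.
# It does not use the bagit gem and ignores line folding for long values.
def render_bag_info(metadata)
  metadata.map { |label, value| "#{label}: #{value}\n" }.join
end

puts render_bag_info({ 'AIP-Version' => 'lightaip-2.0' })
```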