Scale Abacus
We recommend at least two instances of each Abacus application for high availability, so that the pipeline keeps working if an instance crashes or if you need to restart instances individually.
Abacus supports two ways to scale its micro-services:
- instances
- applications
Adding more instances is the typical way to scale a stateless application in Cloud Foundry. Each instance has a number, but you cannot address a specific instance because load balancing is delegated to the routing layer.
When Abacus needs to serialize the update of documents it uses stateful applications. Abacus has to partition the data to a specific service instance to preserve the state.
To achieve this, we use indexed applications instead of instances. The indexed applications are named meter-0, meter-1, and so on. Each of these applications can be accessed directly and is responsible for processing one partition of the data.
Since the applications are stateful, the previous stages of the pipeline need to know about them up front. For example, the collector needs to know the number of meter instances. This number is part of the profile. It can be changed in .apprc or in the application's manifest.yml file.
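The routing of data to an indexed application can be pictured with a minimal sketch. It assumes a hash-modulo scheme; the hash function, the `indexed_app_for` helper, and the sample keys are illustrative assumptions, not the partitioning algorithm Abacus actually uses.

```python
import hashlib


def indexed_app_for(key: str, app: str, app_count: int) -> str:
    """Return the name of the indexed application responsible for `key`.

    Assumption: a stable hash of the key, modulo the number of applications,
    selects the index. Abacus may partition differently.
    """
    if app_count < 1:
        raise ValueError("app_count must be at least 1")
    # A cryptographic digest is stable across processes and restarts,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % app_count
    return f"{app}-{index}"


# Hypothetical keys, for illustration only.
for key in ("resource-instance-a", "resource-instance-b"):
    print(key, "->", indexed_app_for(key, "meter", 2))
```

The sketch also shows why the previous stage has to know `app_count` up front: changing it changes which application a given key maps to.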
Abacus has both stateful and stateless micro-services. Stateless ones can be scaled with instances and stateful ones need to be scaled using indexed applications.
Collector: stateless. It receives batches of submitted usage over HTTP and does 1 db write per batch plus 1 db write per usage doc.
Scale it to give resource providers better response times as they submit usage.
Meter: stateless. It receives individual submitted usage docs from the collector and does 2 db writes per usage doc.
Size it the same as, or slightly larger than, the collector app, because it processes individual docs rather than submitted batches.
Accumulator: stateful, because it accumulates usage per resource instance. It does 2 db writes per usage doc and 1 read per approximately 100 usage docs.
It serializes updates to the accumulated usage per resource instance, so scale it up if your individual resource instances receive a lot of usage:
- resource instances are distributed to db partitions, one partition per instance, and that instance is the only reader/writer from/to that partition
- the performance of the accumulator scales linearly from 1 to 16 instances; we recommend testing its performance in your environment
Aggregator: stateful, because it aggregates usage per organization. It does 2 db writes per usage doc and 1 read per approximately 100 usage docs.
It has the same performance characteristics and observations as the accumulator, except that the write serialization is on an organization basis.
Reporting: stateless. It does 1 db read per report per org.
It scales like a regular Web app, gated by the query performance of your db:
- 2 instances minimum for availability then increase as your reporting load increases
- delegates org lookups to your account info service so include the performance of that service in your analysis as well
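The per-service database costs listed above can be combined into a rough load estimate. The following is a minimal sketch that uses only the figures stated in the text; the function name, its parameters, and the dictionary layout are illustrative assumptions, and real numbers should come from measurements in your environment.

```python
def estimate_db_ops(batches: int, docs: int, org_reports: int = 0,
                    docs_per_read: int = 100) -> dict:
    """Estimate db operations per service for a given amount of usage.

    batches:       number of usage batches submitted to the collector
    docs:          total number of usage docs contained in those batches
    org_reports:   number of per-org reports requested from reporting
    docs_per_read: approximate number of docs per read (about 100 in the text)
    """
    stateful_reads = docs // docs_per_read
    return {
        # 1 write per batch plus 1 write per usage doc
        "collector": {"writes": batches + docs},
        # 2 writes per usage doc
        "meter": {"writes": 2 * docs},
        # 2 writes per usage doc, 1 read per ~100 docs
        "accumulator": {"writes": 2 * docs, "reads": stateful_reads},
        "aggregator": {"writes": 2 * docs, "reads": stateful_reads},
        # 1 read per report per org
        "reporting": {"reads": org_reports},
    }


ops = estimate_db_ops(batches=10, docs=1000, org_reports=5)
for service, counts in ops.items():
    print(service, counts)
```

Reads for the collector and meter are left out because the text does not state any.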
We scale Abacus via profiles. The profile contains information on how many applications/instances and db partitions to use for a certain application.
Abacus supports several profiles defined here. You can use these profiles when starting, stopping, or deploying Abacus to Cloud Foundry.
For example, you can push an Abacus pipeline to Cloud Foundry capable of handling more load by using the profile large:
npm run cfpush -- large
The profiles inherit from the default profile. This allows each profile to specify only the settings that differ from the default.
The settings in the profiles most often include the number of instances, applications, and db partitions for the application. For example, the small profile defines the following scale:
- collector with 2 apps / 1 instance / 1 db partition
- meter with 1 app / 1 instance / 2 db partitions
- accumulator with 1 app / 1 instance / 2 db partitions
- aggregator with 1 app / 1 instance / 2 db partitions
- reporting with 1 app / 1 instance / 2 db partitions
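Profile inheritance can be illustrated with a small sketch. The dictionary layout, the key names (`apps`, `instances`, `db_partitions`), and the values of `DEFAULT_PROFILE` are assumptions made for this example and do not reflect the actual Abacus profile file format; only the resolved values for the small profile come from the list above.

```python
APPS = ("collector", "meter", "accumulator", "aggregator", "reporting")

# Hypothetical default profile: one app, one instance, one db partition each.
DEFAULT_PROFILE = {
    app: {"apps": 1, "instances": 1, "db_partitions": 1} for app in APPS
}

# A derived profile lists only the settings that differ from the default.
SMALL_OVERRIDES = {
    "collector": {"apps": 2},
    "meter": {"db_partitions": 2},
    "accumulator": {"db_partitions": 2},
    "aggregator": {"db_partitions": 2},
    "reporting": {"db_partitions": 2},
}


def resolve_profile(default: dict, overrides: dict) -> dict:
    """Merge per-application overrides on top of the default profile."""
    resolved = {app: dict(settings) for app, settings in default.items()}
    for app, settings in overrides.items():
        resolved.setdefault(app, {}).update(settings)
    return resolved


small = resolve_profile(DEFAULT_PROFILE, SMALL_OVERRIDES)
for app in APPS:
    s = small[app]
    print(f"{app}: {s['apps']} apps / {s['instances']} instance / "
          f"{s['db_partitions']} db partitions")
```

Because the merge copies the default first, a new profile can be added by writing only its overrides.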
You can modify the profiles or add additional ones to comply with the needs of your own installation.
We conducted several experiments to determine the optimal values for the profiles. The results can be found on the Performance page.
Abacus applications work with time-based keys and with key-based keys in two DBs: input and output.
Time-based partitioning can usually be configured as one partition per month, because most db writes and reads target the current month and sometimes the previous month. Monthly DBs can then be archived once they are no longer needed.
If you expect a huge load on Abacus, you can increase the number of partitions to speed up the DB processing.
The number of partitions depends on how many resource instances and organizations you have and the performance of your database as its volume increases.
For the accumulator and aggregator services, you need one db partition per app instance, reserved for that instance. This DB can be further partitioned depending on the number of documents flowing into Abacus.
If you reduce the number of partitions (from 6 to 4, for instance), some of the data becomes inaccessible, since Abacus no longer knows about the data in the last 2 partitions.
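The effect of shrinking the partition count can be shown with a short sketch. It assumes that partitions are numbered from 0 and that a key is mapped to a partition by a checksum modulo the partition count; both the `partition_for` scheme and the sample keys are illustrative assumptions, not the actual Abacus implementation.

```python
import zlib


def partition_for(key: str, partition_count: int) -> int:
    """Map a key to a partition index (assumed checksum-modulo scheme)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


def unreachable_partitions(old_count: int, new_count: int) -> list:
    """Partitions that hold data but are no longer addressed after shrinking."""
    return list(range(new_count, old_count))


print(unreachable_partitions(6, 4))  # [4, 5]

# Hypothetical keys that were written while 6 partitions were configured.
keys = [f"org-{n}" for n in range(20)]
stranded = [k for k in keys if partition_for(k, 6) >= 4]
print(f"{len(stranded)} of {len(keys)} sample keys live in partitions 4 and 5")
```

Increasing the count does not strand any partition in this model, which is why growing is safe while shrinking is not.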
The MongoDB support includes app-level sharding. Abacus can be configured to use more than one DB and to distribute partitions across them in round-robin fashion.
If your DB is getting full, you can create a new service instance and bind it to applications (most often this is needed by aggregator). Abacus will start creating new partitions in the new DB.
This however means that:
- the new DB will be used only for next month's data, starting next month
- you need alerting when your DB is getting full (at 70% usage)
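Round-robin placement of partitions across several DBs, as mentioned above, could look like the following minimal sketch. The `db_for_partition` helper and the example URIs are hypothetical; how Abacus actually selects a DB for a partition is not described here.

```python
from typing import Sequence


def db_for_partition(partition: int, db_uris: Sequence[str]) -> str:
    """Pick the DB for a partition by cycling through the bound DBs in order."""
    if not db_uris:
        raise ValueError("at least one DB is required")
    return db_uris[partition % len(db_uris)]


# Hypothetical service instance URIs, for illustration only.
dbs = ["mongodb://db-0.example.com/abacus", "mongodb://db-1.example.com/abacus"]
for partition in range(4):
    print(partition, "->", db_for_partition(partition, dbs))
```

In this model, binding an additional DB changes where newly created partitions land, which matches the note that the new DB only receives data for partitions created afterwards.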
The Abacus Broker project provides a Housekeeper application that deletes DBs older than a configured number of months. By default, it is configured to 3 months.
You should consider any legal requirements for data retention when configuring the Housekeeper.
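The retention rule can be sketched as a simple month comparison. This is an illustration only: representing each monthly DB by the first day of its month, and treating a DB as expired when it is strictly more than `retention_months` months old, are assumptions and may differ from what the Housekeeper actually does.

```python
from datetime import date


def months_between(older: date, newer: date) -> int:
    """Whole calendar months from `older` to `newer`, ignoring the day."""
    return (newer.year - older.year) * 12 + (newer.month - older.month)


def expired_months(db_months, today: date, retention_months: int = 3) -> list:
    """Return the monthly DBs that are older than the retention period."""
    return [m for m in db_months if months_between(m, today) > retention_months]


# Hypothetical monthly DBs, each identified by the first day of its month.
monthly_dbs = [date(2024, 1, 1), date(2024, 2, 1), date(2024, 3, 1), date(2024, 5, 1)]
for month in expired_months(monthly_dbs, today=date(2024, 6, 15)):
    print("would delete DB for", month.strftime("%Y-%m"))
```

Raising `retention_months` is the knob to adjust when legal requirements call for keeping data longer.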
*Abacus icon made by Freepik from www.flaticon.com