Changelog for Peloton

0.9.0 (unreleased)

0.8.12

0.8.11

0.8.10

0.8.9.6

  • 2019-06-25 Use LOCAL_SERIAL for CAS operations on Cassandra varung@uber.com
  • 2019-06-31 Revert "Remove task_config table" avyas@uber.com
  • 2019-06-25 Drop 0030_drop_mv_job schema migration and rename 0031_add_host_infos kevinxu@uber.com
  • 2019-06-23 Skip default task config not found error due to task_config deprecation kevinxu@uber.com
  • 2019-06-19 Fix hostmgr out of index zhixin@uber.com

0.8.9.1

  • 2019-07-17 Fix delay due to lock contention in job cache sachins@uber.com
  • 2019-07-10 Remove task_config table zhixin@uber.com
  • 2019-07-10 (Part 1) Migrate active_jobs table to ORM yuweit@uber.com
  • 2019-07-09 Fix a failure test in TestMesosAgentFailure zhaokai@uber.com
  • 2019-07-09 Add JobUpdateEventsORM zhixin@uber.com
  • 2019-07-09 [P2K merge #3] v1alpha eventstream for p2k and evaluate host label constraints yiran@uber.com
  • 2019-07-09 Revert "fixed test_in_place_update_with_agent_stop in TestMesosAgentFailure, fix config" amitbose@uber.com
  • 2019-07-09 fixed test_in_place_update_with_agent_stop in TestMesosAgentFailure, fix config zhaokai@uber.com
  • 2019-07-09 Define SLAInfo and add SLA Awareness to Host Maintenance sachins@uber.com
  • 2019-07-09 Migrate GetPodEvents to use ORM and Add OptionalInt custom type sishi@uber.com
  • 2019-07-08 Handle placement with stale run-id in resmgr tracker sachins@uber.com
  • 2019-07-08 Upgrade grpc to version v1.22.0 zhixin@uber.com
  • 2019-07-08 Add info log for tasks failed to get placed on desired host zhixin@uber.com
  • 2019-07-08 Fix report generation issue yuweit@uber.com
  • 2019-07-08 ORM: Add HostInfo object representing a host in maintenance rcharles@uber.com
  • 2019-07-03 Host Maintenance: add host state validation in handler rcharles@uber.com
  • 2019-07-03 (Part 2) ResPool ORM migration yuweit@uber.com
  • 2019-07-03 Migrate jobStore.GetAllJobsInJobIndex to use ORM yuweit@uber.com
  • 2019-07-03 Fix desired host placement deadline zhixin@uber.com
  • 2019-07-03 Minicluster: Fixed make bin/kind target. pourchet@uber.com
  • 2019-07-03 Fix error handling in store.deletePodEventsOnDeleteJob sachins@uber.com
  • 2019-07-03 HostManager: HostCache: Setup pod event handling for resource accounting pourchet@uber.com
  • 2019-07-02 Return YARPC Errors in HostMgr HostMaintenance related handlers rcharles@uber.com
  • 2019-07-02 Skip test__abort_auto_rollback_with_pinned_instances_and_update kevinxu@uber.com
  • 2019-07-02 Temporarily allow deprecated HostMaintenance API request fields rcharles@uber.com
  • 2019-07-02 [DB schema and code] Get rid of mv_jobs_by_state from schema adityacb@uber.com
  • 2019-07-02 Fix build breakage in hostmgr adityacb@uber.com
  • 2019-07-02 Placement Engine: Implemented ToMimirGroup for v1alpha lease object pourchet@uber.com
  • 2019-07-01 Placement Engine: Implemented PortRange functionality in v1alpha lease object pourchet@uber.com
  • 2019-07-01 [P2K merge #2] Basic K8s plugin and host cache implementation yiran@uber.com
  • 2019-06-30 Placement Engine: Rough implementation of v1alpha offers service pourchet@uber.com
  • 2019-06-28 (Part 1) ResourcePool migration for ORM yuweit@uber.com
  • 2019-06-28 Added the PortRange protobuf for use in placement engine pourchet@uber.com
  • 2019-06-28 Drop deprecated fields in resmgr.Placement proto message sachins@uber.com
  • 2019-06-27 Host Maintenance APIs: refactor write API calls to a single host instead of a list rcharles@uber.com
  • 2019-06-27 ORM: Introduce type OptionalString for string primary keys rcharles@uber.com
  • 2019-06-27 Fix performance report error yuweit@uber.com
  • 2019-06-27 Host Manager: Added TerminateLeases protobuf definition pourchet@uber.com
  • 2019-06-27 [DB schema and code] Add podspec and version to task_config table adityacb@uber.com
  • 2019-06-27 Deprecate usage of mv_task_by_state in query sishi@uber.com
  • 2019-06-26 Change queryJobIDs to use QueryJobCache private api kevinxu@uber.com
  • 2019-06-26 Add PARALLEL_STATELESS_UPDATE and STATELESS_CREATE tables in performance tests yuweit@uber.com
  • 2019-06-26 Deprecate usage of mv_task_by_state in job runtime updater sishi@uber.com
  • 2019-06-26 Placement Engine: Config flag to determine v0 or v1alpha hostmanager pourchet@uber.com
  • 2019-06-26 All task writes should be done via job in cache sachins@uber.com
  • 2019-06-26 Enable integration tests for spread placement amitbose@uber.com
  • 2019-06-25 Default updateEvents and instanceEvents empty array instead of nil kevinxu@uber.com
  • 2019-06-25 Fix version number on db migration script sishi@uber.com
  • 2019-06-25 Add rate limit middleware in aurora bridge zhixin@uber.com
  • 2019-06-25 Placement Engine: Move v0 of offers service to own package pourchet@uber.com
  • 2019-06-24 Update secret_info table to use leveled compaction strategy sishi@uber.com
  • 2019-06-24 Placement Engine: Remove hostService member from engine struct pourchet@uber.com
  • 2019-06-24 [DB schema + code] Start writing JobSpec blob to DB as a new job_config column adityacb@uber.com
  • 2019-06-24 Placement Engine: Use model interfaces everywhere pourchet@uber.com
  • 2019-06-21 Implement service tag for StartJobUpdate metrics kevinxu@uber.com
  • 2019-06-21 Notify listeners when task runtime is replaced in cache sachins@uber.com
  • 2019-06-21 Debug log for browse sandbox response varung@uber.com
  • 2019-06-21 Do not copy pip.conf to peloton docker image sachins@uber.com
  • 2019-06-21 Add failure test for in-place update zhixin@uber.com
  • 2019-06-20 DeleteTask should release task lock before notifying listeners sachins@uber.com
  • 2019-06-20 Add sleep for aurora bridge failure tests for lucene index to converge varung@uber.com
  • 2019-06-20 Remove need to run aurora bridge integration test twice varung@uber.com
  • 2019-06-20 Placement Engine: Small cleanup of types and functions pourchet@uber.com
  • 2019-06-19 API: Request on a single host instead of list of hosts for HostMaintenance public APIs rcharles@uber.com

0.8.8

0.8.7

0.8.6

  • 2019-05-23 Placement hints for batch jobs amitbose@uber.com
  • 2019-05-22 Skip recovery if job not present in job_config table kevinxu@uber.com
  • 2019-05-21 Fix vcluster typo zhixin@uber.com
  • 2019-05-21 Fix vcluster typo zhixin@uber.com
  • 2019-05-21 Add auth support in peloton performance test zhixin@uber.com
  • 2019-05-21 Implement limit in GetPod API request to fetch only a subset of the previous runs apoorvaj@uber.com
  • 2019-05-21 Split up GetStateCount in job cache to GetTaskStateCount and GetWorkflowStateCount apoorvaj@uber.com
  • 2019-05-20 Always retry terminated pod whether or not the workflow fails zhixin@uber.com
  • 2019-05-18 Add watch service to publish mesos task status update events arpit.goyal@uber.com
  • 2019-05-17 Publish workflow metrics apoorvaj@uber.com
  • 2019-05-16 Disable xdist parallelism at canary test framework varung@uber.com
  • 2019-05-16 Workflow rollback record instance events for the last batch of instances zhixin@uber.com
  • 2019-05-16 Collect yarpc error code metrics binz@uber.com
  • 2019-05-15 Add leadership middleware to check leadership callback is complete varung@uber.com
  • 2019-05-15 Fix canary test framework dedupe unique job issue varung@uber.com
  • 2019-05-15 Add failure test for stateless jobs to ensure all peloton daemons are invulnerable to failures varung@uber.com
  • 2019-05-15 Add dedupe in read workflow events path varung@uber.com
  • 2019-05-15 Disable starting state timeout for service job zhixin@uber.com
  • 2019-05-14 Re-added kind with wget so it doesn't fail pourchet@uber.com
  • 2019-05-14 Update yarpc and grpc version zhixin@uber.com
  • 2019-05-14 Fix QueryJobCache log level zhixin@uber.com
  • 2019-05-14 Make entity version argument optional in the job stateless replace CLI apoorvaj@uber.com
  • 2019-05-14 Implement the logic for host reservation in resource manager to mark tasks ready for host reservation and send tasks to placement. aihuaxu@uber.com
  • 2019-05-13 Add private QueryJobCache API zhixin@uber.com

0.8.5

0.8.4

  • 2019-04-24 Adjust resource limit for revocable job to reduce QoS preemption varung@uber.com
  • 2019-04-24 Fix thermos executor log read varung@uber.com
  • 2019-04-24 Remove GetLastRuntimeUpdateTime for timeout zhixin@uber.com
  • 2019-04-23 Change "bridge update label" to use UUID string kevinxu@uber.com
  • 2019-04-23 Specify resmgr config overrides for development environment sachins@uber.com
  • 2019-04-23 Add script to enable auth in cluster zhixin@uber.com
  • 2019-04-23 Fix flaky test__auto_rollback_with_pinned_instances__remove_instances sachins@uber.com
  • 2019-04-23 Do not publish event if pod state is INITIALIZED varung@uber.com
  • 2019-04-23 Improve aurora bridge write api logging varung@uber.com
  • 2019-04-22 Remove unused SecretMetrics zhixin@uber.com
  • 2019-04-19 Implement TaskConfig for getTasksWithoutConfigs previous runs kevinxu@uber.com
  • 2019-04-19 Add more aurorabridge integration tests to test auto-rollback of pinned instances sachins@uber.com
  • 2019-04-19 Reduce host pruning period from 10m to 2m varung@uber.com
  • 2019-04-19 Clean up job index table in JobRecover for deleted jobs sachins@uber.com
  • 2019-04-19 QueryJobs API should not fail when update-fetch from store fails for one of the jobs sachins@uber.com
  • 2019-04-19 Add aurorabridge integration tests to test auto-rollback of pinned instances sachins@uber.com
  • 2019-04-18 Add auth for aurora bridge as internal component varung@uber.com
  • 2019-04-18 Pinned instance job spec creation optimizations kevinxu@uber.com
  • 2019-04-18 Cannot update with start_pods set when job is being killed and vice versa zhixin@uber.com
  • 2019-04-18 Skip test__simple_update_with_restart_component due to flakiness varung@uber.com
  • 2019-04-17 Fix for getConfigSummary endpoint when pinned instance is used kevinxu@uber.com
  • 2019-04-17 Use PodStatus to generate published pod events kevinxu@uber.com
  • 2019-04-17 Kill all tasks in job before untrack zhixin@uber.com
  • 2019-04-17 Part I: Start using ORM for job_config adityacb@uber.com
  • 2019-04-17 Remove skip for test_update.py::test__simple_update_with_restart_component varung@uber.com
  • 2019-04-17 Implement hack label for pinned instances kevinxu@uber.com
  • 2019-04-17 Add integration test to abort an auto-rollback and start a new update sachins@uber.com
  • 2019-04-17 Call JobRuntimeUpdater upon JobUntrack for stateless job zhixin@uber.com
  • 2019-04-17 Remove skip for test_rollback.py::test__job_create_manual_rollback varung@uber.com
  • 2019-04-17 Change key of host-to-tasks map to mesos-task-id from peloton-task-id sachins@uber.com
  • 2019-04-16 Add flag to only return current pod_info for GetPod API kevinxu@uber.com
  • 2019-04-16 Fix formatting issue with job.py::wait_for_terminated varung@uber.com
  • 2019-04-16 Add randomization, run tests twice and dump daemon logs varung@uber.com
  • 2019-04-16 Add ORM method for iterative get-all amitbose@uber.com
  • 2019-04-16 Limit number of updates to query per job for aurorabridge kevinxu@uber.com
  • 2019-04-16 Create separate CI for aurora bridge varung@uber.com
  • 2019-04-16 Add integration test to restart different peloton daemons for on-going job update varung@uber.com
  • 2019-04-15 Add travis config file amitbose@uber.com
  • 2019-04-15 Duplicate active update is a noop zhixin@uber.com
  • 2019-04-15 Add auth support in all peloton components zhixin@uber.com
  • 2019-04-15 Mimir placement plugin host filtering amitbose@uber.com
  • 2019-04-12 Specify InvalidEntityVersionError to be part of API zhixin@uber.com
  • 2019-04-11 Implement pinned instance pod spec reading and attaching kevinxu@uber.com
  • 2019-04-11 Convert TaskConfig / PodSpec compare unit test to table test kevinxu@uber.com
  • 2019-04-11 Grammar fix for compare util functions kevinxu@uber.com
  • 2019-04-11 Deprecate respool path in QuerySpec which is not used in JobQuery zhixin@uber.com
  • 2019-04-10 Set per-instance instance event limit for ListJobWorkflows call kevinxu@uber.com
  • 2019-04-10 Sort TaskConfig list fields before converting to PodSpec kevinxu@uber.com
  • 2019-04-10 Change GetWorkflowEventsRequest.limit and ListJobWorkflowsRequest.instance_events_limit to uint32 type sachins@uber.com
  • 2019-04-10 Add PodSpec diff util function kevinxu@uber.com
  • 2019-04-10 Add CHANGELOG for 0.8.3.1 zhixin@uber.com
  • 2019-04-10 Add PodConfigurationStateStats in job status zhixin@uber.com
  • 2019-04-10 Deprecate Mesos references in the v1alpha API apoorvaj@uber.com
  • 2019-04-09 Kill a terminated job to prevent restart before delete zhixin@uber.com
  • 2019-04-09 Add util method for merging default and instance PodSpec kevinxu@uber.com
  • 2019-04-09 Add v0 <=> v1 task constraint nil field check kevinxu@uber.com
  • 2019-04-09 Don't fail perf report generation if a test fails amitbose@uber.com
  • 2019-04-08 Add integration test to redeploy update after abort varung@uber.com
  • 2019-04-08 Add Changelog for release 0.8.3 rcharles@uber.com
  • 2019-04-08 Add option to specify limit while requesting workflow events sachins@uber.com

0.8.3.1

0.8.3

0.8.2.1

  • 2019-03-11 Fix vcluster's use of peloton cluster config path avyas@uber.com

0.8.2

0.8.1

0.8.0

0.7.8.1

0.7.8

0.7.7.3

0.7.7.2

0.7.7.1

0.7.7

0.7.6

0.7.5.2

  • 2018-09-12 Add pod events handle empty desired mesos task id zhixin@uber.com
  • 2018-09-11 Add override url for hostmgr in Peloton Client avyas@uber.com

0.7.5.1

0.7.5

0.7.4

  • 2018-08-20 Fix 'context' of RestoreMaintenanceQueue call in resmgr/recovery.go sachins@uber.com
  • 2018-08-20 Instantiate finished channel outside of NewRecovery avyas@uber.com
  • 2018-08-20 Allow to disable Prometheus while maintaining metric name format rcharles@uber.com
  • 2018-08-20 Integrate health check with update process chunyang.shen@uber.com
  • 2018-08-17 Increase the memlimit for peloton apps on vcluster avyas@uber.com
  • 2018-08-16 Change "total instance count is greater than expected" to debug zhixin@uber.com
  • 2018-08-16 Fix update integration test zhixin@uber.com
  • 2018-08-15 Make MarkHostsDrained API call only for 'DRAINING' hosts sachins@uber.com
  • 2018-08-15 Unit tests for yarpc/peer amitbose
  • 2018-08-14 Remove executor code from peloton because it is not used adityacb@uber.com
  • 2018-08-14 Update PodEvent protobuf to use mesosTaskID rather than pelotonTaskID varung@uber.com
  • 2018-08-14 Adding tests for resource manager recovery mabansal@uber.com
  • 2018-08-14 Update logrus version to ^1.0.0 avyas@uber.com
  • 2018-08-14 Disable prometheus reporting for resource manager avyas@uber.com
  • 2018-08-14 Add integration test for update zhixin@uber.com
  • 2018-08-13 Enable archiver streaming only mode, add unit tests adityacb@uber.com
  • 2018-08-13 Removing interfaces from hostmanager reserver and cleanup code mabansal@uber.com
  • 2018-08-13 [API] Set task kill grace period per each task adityacb@uber.com
  • 2018-08-13 UpdateRun only checks currently updating instances zhixin@uber.com
  • 2018-08-13 Restore host->tasks map on resmgr recovery sachins@uber.com
  • 2018-08-12 Dual write state transitions to pod_events and task_events for batch jobs varung@uber.com
  • 2018-08-10 Check nil for cached job zhixin@uber.com
  • 2018-08-09 Measure resource pool SLA at a more granular level avyas@uber.com
  • 2018-08-09 Add UNKNOWN and DISABLED into health state and persist health state into pod_events table chunyang.shen@uber.com
  • 2018-08-09 Add unit-tests for package util amitbose
  • 2018-08-09 Add unit-tests for package common/background amitbose
  • 2018-08-09 Terminated task with correct desired configuration is update complete zhixin@uber.com
  • 2018-08-08 Restore host->tasks map on resmgr recovery sachins@uber.com
  • 2018-08-08 Add tests for resmgr/server.go avyas@uber.com
  • 2018-08-06 Improve code coverage for common/logging varung@uber.com
  • 2018-08-06 Update the task healthy field only when event reason is REASON_TASK_HEALTH_CHECK_STATUS_UPDATED chunyang.shen@uber.com
  • 2018-08-06 Make vcluster run on dca1-preprod01 amitbose
  • 2018-08-06 Add unit test for tasksvc start and stop zhixin@uber.com
  • 2018-08-03 Add tests for reconciler avyas@uber.com
  • 2018-08-03 Revert "Revert "Record state transition durations for RMTask"" avyas@uber.com
  • 2018-08-03 Add comments for test coverage of JobMgr chunyang.shen@uber.com
  • 2018-08-03 Add job configuration validation for different types of jobs chunyang.shen@uber.com
  • 2018-08-02 Add CLI to get Pod Events varung@uber.com
  • 2018-08-02 Increase code coverage for hostmgr/reconcile sachins@uber.com
  • 2018-08-02 JobRuntimeUpdater treats job with update as non-partially-created job zhixin@uber.com
  • 2018-08-02 Set task goal state to KILLED for terminated task zhixin@uber.com
  • 2018-08-02 Handle errors during update create apoorvaj@uber.com
  • 2018-08-02 Add unit test for tasksvc get zhixin@uber.com
  • 2018-08-02 Add unit test for jobmgr/task/placement zhixin@uber.com
  • 2018-08-02 Add unit test for volumesvc zhixin@uber.com
  • 2018-08-02 Add unit test for tasksvc get events zhixin@uber.com
  • 2018-08-02 Increase code coverage for hostmgr/ sachins@uber.com
  • 2018-08-01 Increase code coverage for hostmgr/hostsvc sachins@uber.com
  • 2018-08-01 Increase code coverage for hostmgr/queue sachins@uber.com
  • 2018-08-01 Implement update pause on server side zhixin@uber.com
  • 2018-08-01 Do not start goal state engines and preemptor before recovery is complete apoorvaj@uber.com
  • 2018-07-31 Fix resource usage map panic for older tasks adityacb@uber.com
  • 2018-07-31 Add update pause in cli zhixin@uber.com

0.7.3

0.7.2

0.7.1.3

0.7.1.2

0.7.1.1

0.7.1

0.7.0

  • 2018-06-13 Add secrets log formatter to redact secret data in logs adityacb@uber.com
  • 2018-06-13 Fixing flaky test case for Entitlement Calculation mabansal@uber.com
  • 2018-06-13 Record state transition durations for RMTask avyas@uber.com
  • 2018-06-13 Refactoring GetHosts in HostManager as well as some code cleanup mabansal@uber.com
  • 2018-06-13 Adding Reserver in placement engine for host reservation mabansal@uber.com
  • 2018-06-12 Refactor task_test.go to use test suite zhixin@uber.com
  • 2018-06-12 Refine Cassandra Table Attributes varung@uber.com
  • 2018-06-12 Increase placement engine worker threads varung@uber.com
  • 2018-06-12 Job Update/Get API now supports secrets adityacb@uber.com
  • 2018-06-11 Move v0 Peloton API to protobuf/peloton/api/v0 min@uber.com
  • 2018-06-08 Update changeLog version to max version plus one when update job config zhixin@uber.com
  • 2018-06-08 Fix deadlock in task tracker avyas@uber.com
  • 2018-06-07 Make job recovery failure a fatal error apoorvaj@uber.com
  • 2018-06-07 Add API to query job and task cache zhixin@uber.com
  • 2018-06-06 Job runtime and config read gets data from cache when possible zhixin@uber.com
  • 2018-06-06 Format code using gofmt avyas@uber.com
  • 2018-06-06 Make errChan buffered eq to len of jobsByBatch to prevent goroutine leak on recoverJobsBatch failure varung@uber.com
  • 2018-06-06 Fix populateSecrets bug when launching tasks from jobmgr adityacb@uber.com
  • 2018-06-05 Partially created job set to INITIALIZED and enqueue to goalstate engine zhixin@uber.com
  • 2018-06-05 Kill the orphan Mesos task before regenerate a new MesosTaskID chunyang.shen@uber.com
  • 2018-06-04 Add more logs for placement engine varung@uber.com
  • 2018-06-04 Remove statusUpdaterRM zhixin@uber.com
  • 2018-06-04 Add/cleanup logs to track stuck tasks after failure to launch in job manager apoorvaj@uber.com
  • 2018-06-04 All job config and runtime update go through cache zhixin@uber.com
  • 2018-06-04 Fix async pool test & minor code refactoring varung@uber.com
  • 2018-05-31 Reconcile jobs and tasks in KILLING state as well apoorvaj@uber.com
  • 2018-05-30 Add stop feature for async pool jobs varung@uber.com
  • 2018-05-30 Add host manager API support in pCluster for cli varung@uber.com
  • 2018-05-29 Fix GetJobConfig to use the correct configuration version apoorvaj@uber.com
  • 2018-05-28 Add metrics to goal state to help debugging apoorvaj@uber.com
  • 2018-05-25 Making consistent function calls in entitlement calculator and reducing duplicate code in tests mabansal@uber.com

0.6.14

0.6.13

0.6.12

  • 2018-04-19 Hide admission of non-preemptible jobs behind a flag Anant Vyas
  • 2018-04-20 Checking mesos taskId before removing task from tracker Mayank Bansal
  • 2018-04-19 Enable Aurora health check for Peloton Tengfei Mu
  • 2018-04-19 Update changelog for 0.6.12 Tengfei Mu
  • 2018-04-19 Fixing race condition between removing task from tracker and adding the same task with different mesos task id Mayank Bansal
  • 2018-04-19 Update health.leader when candidate is not leader Zhixin Wen
  • 2018-04-19 Add Host APIs Sachin Sharma
  • 2018-04-18 Add comment for channel 'finished' in resmgr/recovery.go Sachin Sharma
  • 2018-04-18 fix unit test broken by revert of 4533a25 Zhixin Wen
  • 2018-04-18 eventstream client send correct purgeOffset upon restart Zhixin Wen
  • 2018-04-18 unset completion time when task is running Zhixin Wen
  • 2018-04-17 Revert "Revert "Add 100k task per job limit to master code"" Aditya Bhave
  • 2018-04-17 Retry Do not recover FAILED jobs till archiver is committed Tengfei Mu
  • 2018-04-17 Revert "Rearchitect the job manager to use the cache and the goal state engine" Tengfei Mu
  • 2018-04-17 Revert "Do not recover FAILED jobs till archiver is committed." Tengfei Mu
  • 2018-04-17 Revert "Add 100k task per job limit to master code" Tengfei Mu
  • 2018-04-17 Revert "Fix completion time for jobs moving from PENDING to KILLED" Tengfei Mu
  • 2018-04-16 Fix completion time for jobs moving from PENDING to KILLED Aditya Bhave
  • 2018-04-16 Add max_retry_attempts for test__create_job to pass smoketest Chunyang Shen
  • 2018-04-12 Add 100k task per job limit to master code Aditya Bhave
  • 2018-04-13 enable host tags for metrics Zhixin Wen
  • 2018-04-10 Bump up C* timeouts and add timers to recovery code Aditya Bhave
  • 2018-04-12 Add Host Maintenance API Sachin Sharma
  • 2018-04-11 Change GC and compaction for tables with large partitions Aditya Bhave
  • 2018-04-10 Adding errorcodes in communication between resmgr and jobmgr for enqueuegangs Mayank Bansal
  • 2018-04-10 fix potential memory leak in priorityQueue Zhixin Wen
  • 2018-03-28 Make preemptor aware of non-preemptible tasks Anant Vyas
  • 2018-03-26 Admission control for non-preemptible gangs Anant Vyas
  • 2018-04-09 remove unused api.ResultSet to pass lint Zhixin Wen
  • 2018-04-09 Reconcile Staging Tasks Varun Gupta
  • 2018-04-05 Add script to do performance comparison between two versions Chunyang Shen
  • 2018-04-04 Push to registry docker-registry02-sjc1:5055 Chunyang Shen
  • 2018-04-04 Do not recover FAILED jobs till archiver is committed. Apoorva Jindal
  • 2018-04-03 Fix docker build script and update ATG registry Chunyang Shen
  • 2018-03-22 Rearchitect the job manager to use the cache and the goal state engine Apoorva Jindal
  • 2018-04-02 Add a log when transient DB error occur on the hostmgr eventstream path Apoorva Jindal
  • 2018-03-29 Fix resmgr reason for state transition Apoorva Jindal
  • 2018-04-02 Update Glide installation in Makefile Chunyang Shen
  • 2018-03-26 Don't log UUID in sentry error Anant Vyas
  • 2018-03-04 Add a common library to implement a goal state engine Apoorva Jindal
  • 2018-03-23 Rename metric tag from type to result for success/fail Charles Raimbert
  • 2018-03-22 Delete job_index entry as part of DeleteJob Aditya Bhave
  • 2018-03-20 Address remaining review comments on in-memory DB Apoorva Jindal

0.6.11

  • 2018-03-21 Pin down YARPC version in glide to avoid uber fx Charles Raimbert
  • 2018-03-21 Use patched docker/libkv for ZooKeeper Leader Election Charles Raimbert
  • 2018-03-21 Use long running job fixture for test__stop_long_running_batch_job_immediately Anant Vyas
  • 2018-03-20 Modify GetTasksForJobAndStates to accept []TaskState parameter instead of []string Sachin Sharma
  • 2018-03-15 Add integration test for Job Query API Aditya Bhave
  • 2018-03-16 Do not update the state transition reason on dequeue from placement engine Apoorva Jindal
  • 2018-03-15 Correct scheduled task accounting in case of launch errors for maxRunningInstance feature Apoorva Jindal
  • 2018-03-04 Add cache to job manager. Apoorva Jindal
  • 2018-03-19 Adding support for static respool in Tree hierarchy and Entitlement Mayank Bansal
  • 2018-03-12 Add support to query jobs by timerange Aditya Bhave
  • 2018-03-08 Be able to teardown vcluster in any fail in launching or testing vcluster Chunyang Shen
  • 2018-03-14 Add runtime info to jobquery cli output Aditya Bhave
  • 2018-03-14 Always evaluate a job for maxRunningInstances SLA irrespective of job runtime updater result Apoorva Jindal
  • 2018-03-13 Adding Static reservation type in to resourcepool config Mayank Bansal
  • 2018-03-08 Add integration tests for controller task Anant Vyas
  • 2018-03-06 Add a monitor job for vcluster to send data to M3 Chunyang Shen
  • 2018-03-08 Enable integration test for fetching logs of previous task runs of failed task Apoorva Jindal
  • 2018-03-09 Dividing entitlement calculation to phases and adding more tests to entitlement Mayank Bansal
  • 2018-03-07 Do not overwrite killed state for partially completed jobs Apoorva Jindal
  • 2018-03-08 Add 'task query' command to CLI to query on tasks(for a job) by state(s) Sachin Sharma
  • 2018-03-07 Fix race condition in state machine rollback Anant Vyas

0.6.10.5

0.6.10.4

  • 2018-03-06 Remove 7 day time span restriction from querying active jobs adityacb@uber.com

0.6.10.3

0.6.10.2

  • 2018-03-02 Revert DequeueGang to get CONTROLLER task as well avyas@uber.com

0.6.10.1

  • 2018-02-28 Revert "Add 'task query' command to CLI to query on tasks(for a job) by state(s)" rcharles@uber.com

0.6.10

0.6.9

0.6.8.2

  • 2018-02-06 Untrack failed tasks with goal state succeeded. Apoorva Jindal

0.6.8.1

  • 2018-02-03 Fix migrate script for job_index Aditya Bhave

0.6.8

  • 2018-02-02 Removing race between different transitions in state machine Mayank Bansal
  • 2018-02-02 Adding mesos quota support in cluster capacity call for host manager Mayank Bansal
  • 2018-01-31 Schema and DB change to speed up JobQuery Aditya Bhave
  • 2018-02-02 Adding Limit support for resource pools Mayank Bansal
  • 2018-01-31 Adding apidoc in docs folder from build Mayank Bansal
  • 2018-01-31 Adding peloton engdocs Mayank Bansal
  • 2018-01-02 Add extra logging in state machine implementation Anant Vyas
  • 2018-01-31 Changing api docs to html format Mayank Bansal
  • 2018-01-25 Ignore failure event due to duplicate task ID message from Mesos Apoorva Jindal
  • 2018-01-26 Send kill of PENDING tasks to resource manager Apoorva Jindal
  • 2018-01-24 Send initialized tasks during recovery as a batch to resource manager Apoorva Jindal
  • 2018-01-24 Guard against any case when hostname may be missing in offer pool. Zhitao Li
  • 2018-01-22 Add Script for performance test running Chunyang Shen
  • 2018-01-11 Fix sorting based on creation/completion time in job query Apoorva Jindal
  • 2018-01-24 Do not run job action with a context timeout. Apoorva Jindal
  • 2018-01-23 Revert "Temporarily, do not recover initialized tasks in non-initialized jobs in job manager" Apoorva Jindal
  • 2018-01-08 shutdown executor after task kill timeout Chunyang Shen

0.6.7

0.6.6

  • 2018-01-18 Change update task runtime success message to debug. apoorvaj@uber.com
  • 2018-01-18 PENDING tasks should not be re-sent to resource manager apoorvaj@uber.com
  • 2018-01-18 Cleanup in placement processor apoorvaj@uber.com
  • 2018-01-18 Do not update task runtime for all orphan tasks mu@uber.com
  • 2018-01-18 Add stateful integration tests kejlberg@uber.com
  • 2018-01-18 Bugfix Mimir placement strategy and bump Mimir-lib kejlberg@uber.com
  • 2018-01-16 Fix make test. apoorvaj@uber.com
  • 2018-01-12 Task runtime information in the cache should be either nil or in sync with DB apoorvaj@uber.com
  • 2018-01-12 Mesos state STAGING maps to LAUNCHED state in peloton. apoorvaj@uber.com
  • 2018-01-12 Adding Demand metrics as well as updating static metrics with dynamic metrics mabansal@uber.com
  • 2018-01-11 Do not recover old terminated batch jobs with unknown goal state. apoorvaj@uber.com
  • 2018-01-11 Automatically set GOMAXPROCS to match Linux container CPU quota avyas@uber.com
  • 2018-01-10 Fix stateful placement engine to dequeue and place stateful tasks mu@uber.com
  • 2018-01-09 Add a counter about number of hosts acquired and released on hostmgr zhitao@uber.com
  • 2018-01-09 Add update API and DB schema apoorvaj@uber.com
  • 2018-01-08 Add 50k & 100k tasks perf base jobs rcharles@uber.com
  • 2018-01-08 Fix logging for ELK ingestion rcharles@uber.com
  • 2018-01-08 Change the placement models so that they will be json serialized when using them in logging fields. kejlberg@uber.com
  • 2018-01-05 Adding API doc in peloton repo mabansal@uber.com
  • 2018-01-04 Add ability to preempt PLACING tasks avyas@uber.com
  • 2018-01-04 Use mimir placement strategy for stateful task placement mu@uber.com
  • 2018-01-04 The placement engine now returns failed tasks to the resource manager. kejlberg@uber.com
  • 2018-01-04 Add support for re-enqueuing unplaced tasks into the resource manager kejlberg@uber.com

0.6.5

  • 2018-01-03 Skip reschedule stateful task upon task lost event mu@uber.com
  • 2018-01-03 Refactor and add tests to resource manager respool pkg avyas@uber.com
  • 2018-01-03 Implemented API to get Task Events adityacb@uber.com
  • 2018-01-02 Make kill job faster and fix regression in create job apoorvaj@uber.com
  • 2018-01-02 Do not reschedule already scheduled INITIALIZED tasks apoorvaj@uber.com
  • 2017-12-29 Virtual Mesos cluster setup through Peloton Client chunyang.shen@uber.com
  • 2017-12-29 Add cli command to list and clean persistent volume mu@uber.com
  • 2017-12-29 Update volume state to be DELETED in resource cleaner mu@uber.com
  • 2017-12-29 Enable "shutdown executor" for hostmgr chunyang.shen@uber.com
  • 2017-12-28 Implement job stop using job goal state apoorvaj@uber.com
  • 2017-12-28 Take lock before reading/writing to job struct apoorvaj@uber.com
  • 2017-12-27 Acquire read lock before getting job in tracked manager apoorvaj@uber.com
  • 2017-12-27 Fix deployment script to ignore apps that don't exist mu@uber.com
  • 2017-12-26 Added mesos client for executor pourchet@uber.com
  • 2017-12-26 Add option to start stateful placement engine in deployment script mu@uber.com
  • 2017-12-22 Allow a job configuration without a default configuration. apoorvaj@uber.com
  • 2017-12-22 Add support for MaximumRunningInstances SLA configuration. apoorvaj@uber.com
  • 2017-12-21 Implement job recovery in goal state engine in job manager apoorvaj@uber.com
  • 2017-12-20 Move task state to PENDING after enqueuing it to resource manager. apoorvaj@uber.com
  • 2017-12-20 [hostmgr] Separate reporting between no offer and mismatch status. zhitao@uber.com
  • 2017-12-15 Move creation of tasks and recovery into job goal state apoorvaj@uber.com
  • 2017-12-15 Change scalar.Resources methods from pointer receiver to non-pointer. zhitao@uber.com

0.6.4

  • 2017-12-14 Skip terminal jobs during job manager sync from DB @apoorvaj
  • 2017-12-14 Added the mesos podtask @pourchet

0.6.3

  • 2017-12-14 Increase MaxRecvMsgSize in gRPC to 256MB @min
  • 2017-12-14 Merge the placement engine from the master branch into release @kejlberg
  • 2017-12-13 Move metrics gauge update to asynchronous @zhitao
  • 2017-12-13 Update volume state upon stateful task running status update @mu
  • 2017-12-13 Add more logging for jobmgr to launch stateful @mu
  • 2017-12-13 Fixing Integration test preprod cluster zk address @mabansal
  • 2017-12-13 Add reservation cleaner to clean both unused volume and resources @mu
  • 2017-12-13 Add job goal state to job manager @apoorvaj
  • 2017-12-13 Add materialized view for volume by state @mu

0.6.2

  • 2017-12-12 Adding more logging to entitlement calculator in resmgr @Mayank Bansal
  • 2017-12-12 Revert "Check in mocks" @Antoine Pourchet
  • 2017-12-12 Adding deadline feature in Peloton @Mayank Bansal
  • 2017-12-08 Add changelog for changes between 0.5.0 and 0.6.0 @Anant Vyas

0.6.1

  • 2017-12-08 Improve Resource Manager recovery performance @Anant Vyas
  • 2017-12-06 Add materialized view for volumes by job ids @Tengfei Mu
  • 2017-12-07 Update task runtime state when receiving a mesos kill event @Apoorva Jindal
  • 2017-12-06 Do not update runtime reason on mesos update always @Apoorva Jindal
  • 2017-12-06 Move volumesvc from hostmgr to jobmgr @Tengfei Mu
  • 2017-12-07 Check in mocks @Tengfei Mu
  • 2017-12-05 Kill orphaned tasks in mesos @Apoorva Jindal
  • 2017-12-05 Implement volume list and delete API @Tengfei Mu
  • 2017-12-04 Add reason and message for every update to task runtime @Apoorva Jindal
  • 2017-12-04 Return failed instance list in task stop and task start @Apoorva Jindal
  • 2017-12-01 Handle task start of failed tasks @Apoorva Jindal
  • 2017-11-28 Restart the goal state when placement received for a task which needs to be killed. @Apoorva Jindal
  • 2017-11-29 Handle stopped tasks during reconciliation. @Apoorva Jindal
  • 2017-12-01 Add yaml files for performance tests @Apoorva Jindal
  • 2017-11-30 Remove smoketest tag from preemption integ test @Anant Vyas
  • 2017-11-21 Porting storage changes from master to release @Apoorva Jindal