Add ansible playbook to clean nodes in inventory #104

Open
wants to merge 5 commits into
base: main
50 changes: 50 additions & 0 deletions Linux/jenkins-node/ansible/clean-jenkins-agents.yml
@@ -0,0 +1,50 @@
- name: Playbook to clean jenkins agents by removing workspaces.
  hosts: all

Contributor

@warunawickramasingha warunawickramasingha Feb 12, 2025


I think it would be useful to have an additional section that logs the directories this playbook would delete. For example, logging tasks that always run, regardless of the tag specified, could list the existing directories along with their sizes.

The same tasks can be used to check the existing directories before deleting anything (by first running the playbook without any tags).

Suggested change
    - name: Collect existing dir names and their sizes
      community.docker.docker_container_exec:
        container: "{{ agent_name }}"
        command: bash -l -c "du -sh * | awk '{print $1, $2}'"
        chdir: "/jenkins_workdir/workspace"
      register: dirs_and_sizes
      become: yes
      tags: [always]
    - name: Log existing dirs with their sizes
      debug:
        msg: "{{ dirs_and_sizes.stdout_lines }}"
      tags: [always]
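
The listing pipeline in the suggestion can be tried locally without Docker. The following is a minimal sketch under assumptions: the temporary directory and the directory names are made up for illustration and only stand in for `/jenkins_workdir/workspace`.

```sh
#!/bin/sh
# Sketch only: a throwaway directory stands in for /jenkins_workdir/workspace.
# The directory names below are hypothetical examples, not real job names.
workspace="$(mktemp -d)"
mkdir -p "$workspace/pull_requests-example" "$workspace/main_nightly_deployment-example"
echo "some build output" > "$workspace/pull_requests-example/log.txt"

# Same pipeline as the suggested task: one "<size> <name>" line per entry.
listing="$(cd "$workspace" && du -sh * | awk '{print $1, $2}')"
echo "$listing"

# Clean up the throwaway directory.
rm -rf "$workspace"
```

Because `awk` prints only the first two whitespace-separated fields, a directory name containing spaces would be cut short in this listing.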

  # Tags available: pr, nightly, package, docs
  tasks:
    - name: Collect workspace sizes
      community.docker.docker_container_exec:
        container: "{{ agent_name }}"
        command: bash -l -c "du -sh * | column -t"
        chdir: "/jenkins_workdir/workspace"
      register: du_result
      become: yes
      tags: [always]

    - name: Display workspace sizes
      debug:
        msg: "{{ du_result.stdout_lines }}"
      tags: [always]

    - name: Remove PR directories
      community.docker.docker_container_exec:
        container: "{{ agent_name }}"
        command: bash -l -c "rm -rf pull_requests*"
        chdir: "/jenkins_workdir/workspace"
      become: yes
      tags: [never, pr]

    - name: Remove Nightly directories
      community.docker.docker_container_exec:
        container: "{{ agent_name }}"
        command: bash -l -c "rm -rf *_nightly_deployment*"
        chdir: "/jenkins_workdir/workspace"
      become: yes
      tags: [never, nightly]

    - name: Remove Packages from Branch directories
      community.docker.docker_container_exec:
        container: "{{ agent_name }}"
        command: bash -l -c "rm -rf build_packages_from_branch*"
        chdir: "/jenkins_workdir/workspace"
      become: yes
      tags: [never, package]

    - name: Remove Docs Build directories
      community.docker.docker_container_exec:
        container: "{{ agent_name }}"
        command: bash -l -c "rm -rf build_and_publish_docs*"
        chdir: "/jenkins_workdir/workspace"
      become: yes
      tags: [never, docs]
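
To illustrate which directories the glob patterns in the removal tasks would match, the sketch below replays two of them against a throwaway directory. It is an illustration under assumptions: the directory names are hypothetical, and only the glob patterns are taken from the playbook.

```sh
#!/bin/sh
# Sketch only: a throwaway directory stands in for /jenkins_workdir/workspace.
# All directory names are hypothetical; only the glob patterns come from the playbook.
workspace="$(mktemp -d)"
for dir in pull_requests-a main_nightly_deployment-b build_packages_from_branch-c \
           build_and_publish_docs-d unrelated_job; do
    mkdir -p "$workspace/$dir"
done

# Roughly what the tasks selected by "-t pr,nightly" run inside the container:
(cd "$workspace" && rm -rf pull_requests* *_nightly_deployment*)

# Whatever the chosen patterns did not match is left in place.
remaining="$(cd "$workspace" && ls | sort | xargs)"
echo "$remaining"

# Clean up the throwaway directory.
rm -rf "$workspace"
```

Directories that match none of the selected patterns, such as the made-up `unrelated_job`, are left untouched.
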
40 changes: 27 additions & 13 deletions Linux/jenkins-node/ansible/readme.md
@@ -7,17 +7,17 @@ New Linux nodes for use on Jenkins are now set up using ansible scripts in this
- Ensure you have activated a conda environment with ansible (you may need to use the conda environment set up for use with the ansible-linode repo).
- Set up a new node on Jenkins. The easiest way to set up a jenkins node is to copy an existing node: From the jenkins menu, select `New Item`, type in the name for your node, then scroll down to the `copy from` box and enter the node name you wish to copy.
- To get the secret for your newly set up node, select your new node from the `Build Executor Status` pane on the left hand side of the jenkins home page. The secret should be displayed as part of the command in the box below `Run from agent command line:`. Note, if you are setting up a node directly on the staging server you will have to use the following command in the jenkins console (`<jenkins_url>/script`) to obtain the secret:
-```
+```java
jenkins.model.Jenkins.getInstance().getComputer("<jenkins node name>").getJnlpMac()
```
- navigate to the [`Ansible folder`](https://github.com/mantidproject/dockerfiles/tree/main/Linux/jenkins-node/ansible)
- Update the `inventory.txt` file with the IP address, agent name and agent secret (i.e. Jenkins secret code). Be sure to save the file once the update is complete.
- If creating staging nodes, update the `jenkins-agent-staging.yml` file to specify the correct `jenkins_url`, and `jenkins_identity` variables. To get the `jenkins_identity` use the following command in the jenkins console:
-```
+```java
hudson.remoting.Base64.encode(org.jenkinsci.main.modules.instance_identity.InstanceIdentity.get().getPublic().getEncoded())
```
- Run the following command, replacing `staging` with `production` if appropriate and replacing `FedID` with your FedID.
-```
+```sh
ansible-playbook -i inventory.txt jenkins-agent-staging.yml -u FedID -K
```
- you will be asked for a password - enter your FedID password
@@ -30,37 +30,51 @@ When running the `jenkins-agent-staging.yml` or `jenkins-agent-production.yml` p
- `agent`: roles tagged with the `agent` tag deploy the docker container that constitutes the jenkins agent.

To use a tag, you pass it in to the `ansible-playbook` command with the `-t` flag. For example, if you have already set up the host machine and just want to deploy a jenkins agent:
-```
+```sh
ansible-playbook -i inventory.txt jenkins-agent-staging.yml -u FedID -K -t agent
```

## Cleaning nodes

-Before cleaning any nodes mark them temporarily offline on Jenkins and ensure no jobs are running on them before cleaning.
+- Before cleaning any nodes, mark them temporarily offline on Jenkins and ensure no jobs are running on them.
+
+- Update the `inventory.txt` file as above, including only the nodes you intend to clean.
+
+- The tasks in the cleaning playbook make use of tags to restrict what is cleaned:
+
+  - `pr`: Pull Requests.
+  - `nightly`: Nightly deployments for main and release next.
+  - `package`: Build Packages from Branch.
+  - `docs`: Docs build and publish.

-The easiest way to clean nodes is using a groovy script on Jenkins. Use the links below for guidance
-- [`Remove directories across multiple nodes`](https://developer.mantidproject.org/JenkinsConfiguration.html#remove-directories-across-multiple-nodes)
-- [`Remove directories from single node`](https://developer.mantidproject.org/JenkinsConfiguration.html#remove-directories-from-single-node)
+- Run the following command with the desired tags, given as a comma-separated list:
+```sh
+ansible-playbook -i inventory.txt clean-jenkins-agents.yml -u FedID -K -t pr,nightly,package,docs
+```
+
+- Bring the nodes you marked offline back online.
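
The comma-separated tag list for the cleaning command can be assembled as in the sketch below. This is only an illustration: `build_clean_command` is a hypothetical helper that is not part of this repository, and it prints the command rather than running it.

```sh
#!/bin/sh
# Sketch only: build_clean_command is a hypothetical helper, not part of the repo.
# It prints the ansible-playbook command instead of running it.
build_clean_command() {
    tags=""
    for tag in "$@"; do
        case "$tag" in
            pr|nightly|package|docs) ;;
            *) echo "unknown tag: $tag" >&2; return 1 ;;
        esac
        # Join the tags with commas, as the -t flag expects.
        tags="${tags:+$tags,}$tag"
    done
    if [ -z "$tags" ]; then
        # With no tags, only the tasks tagged "always" (the size listing) would run.
        echo "ansible-playbook -i inventory.txt clean-jenkins-agents.yml -u FedID -K"
    else
        echo "ansible-playbook -i inventory.txt clean-jenkins-agents.yml -u FedID -K -t $tags"
    fi
}

build_clean_command pr nightly
```

Running the playbook without `-t` first is a way to review the workspace sizes before deleting anything, since the removal tasks are tagged `never` and are skipped unless their tag is requested.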

-If this does not work you may need to spin up a new docker container. Use the instructions for Changing docker image below.
+### Troubleshooting
+
+- If this does not work you may need to spin up a new docker container. Use the instructions for Changing docker image below.

## Changing docker image

If you need to update the docker image or spin up a new docker container on a Linux machine - follow these instructions. Before starting mark any nodes you will be changing as temporarily offline on Jenkins and ensure no jobs are running on them.

- ssh into the node
- Stop and remove the container using the following command, replacing machinename with the appropriate name e.g. `isis-cloud-linux-1`
-```
+```sh
docker stop machinename && docker rm machinename
```
- Remove any associated volumes using the following command, again replacing machinename with the appropriate name
-```
+```sh
docker volume rm machinename
```
- for cloud machines, close the ssh connection and follow the instructions for setting up cloud nodes above.
- for physical machines navigate to the folder that contains the `deploy.sh` script. This is normally `dockerfiles/jenkins-node/bin`.
- Run the following command (you may be able to find it using reverse search)
-```
+```sh
./deploy.sh machinename agent_secret "https://builds.mantidproject.org" latest 50G
```
- on success close the ssh connection and check node connects on Jenkins