In some cases, the interface given in MAC_ADDR_MATCH_INTERFACE can be an
alias or altname. The test cannot use the altname; it must use the "real"
interface name.
For example, on some systems, if `MAC_ADDR_MATCH_INTERFACE=enX1`, the test
will fail because it is an altname for `ens4`:
```
+ ip addr show enX1
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff
altname enp0s4
altname enx525400123457
altname enX1
```
The test will now parse the output of `ip addr show $name` to get the real interface name.
Also, improve the fallback method to look for common secondary interface names
such as eth1 and ens4 in case MAC_ADDR_MATCH_INTERFACE is not one of these.
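The parsing step can be sketched roughly like this (a minimal illustration of the approach, not the role's actual code; the helper name is made up):

```python
import re

def real_ifname(ip_addr_show_output):
    """Extract the real interface name from `ip addr show NAME` output.

    The first line looks like "3: ens4: <BROADCAST,...> mtu 1500 ...",
    so the real name is the field between the index and the flags,
    even when NAME was an altname such as enX1.
    """
    match = re.match(r"^\d+:\s+([^:@\s]+)", ip_addr_show_output)
    return match.group(1) if match else None

output = (
    "3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel "
    "state UP group default qlen 1000\n"
    "    link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff\n"
    "    altname enX1\n"
)
print(real_ifname(output))  # ens4
```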
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
NOTE: This also requires upgrading to tox-lsr 3.11.0
Ansible 2.19 will be released soon and has some changes which will
require fixes in system roles. This adds 2.19 to our testing matrix
on Fedora 42 so that we can start addressing these issues.
python 3.13 is now being used on some platforms.
Using ansible-core 2.18 requires using py311 for pylint and other
python checkers.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
NOTE: This also requires upgrading to tox-lsr 3.10.0, and some
hacks to work around a podman issue on Ubuntu.
These tests run the role during a bootc container image build, deploy
the container into a QEMU VM, boot that, and validate the expected
configuration there. They run in two different tox environments, and
thus have to be run in two steps (preparation in buildah, validation in
QEMU). The preparation is expected to output a qcow2 image in
`tests/tmp/TESTNAME/qcow2/disk.qcow2`, i.e. the output structure of
<https://github.com/osbuild/bootc-image-builder>.
There are two possibilities:
* Have separate bootc end-to-end tests. These are tagged with
`tests::bootc-e2` and are skipped in the normal qemu-* scenarios.
They run as part of the container-* ones.
* Modify an existing test: These need to build a qcow2 image exactly
*once* (via calling `bootc-buildah-qcow.sh`) and skip setup/cleanup
and role invocations in validation mode, i.e. when
`__bootc_validation` is true.
In the container scenario, run the QEMU validation as a separate step in
the workflow.
See https://issues.redhat.com/browse/RHEL-88396
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
Add Fedora 42 to testing farm test matrix, drop Fedora 40
Use tox-lsr 3.9.0 for the `--lsr-report-errors-url` argument.
Add the argument `--lsr-report-errors-url DEFAULT` to the qemu test so that
the errors will be written to the output log. This uses the output callback
https://github.com/linux-system-roles/auto-maintenance/blob/main/callback_plugins/lsr_report_errors.py
Use the check_logs.py script
https://github.com/linux-system-roles/auto-maintenance/blob/main/check_logs.py
with the `--github-action-format` argument to format the errors
in a github action friendly manner.
Rename the log files to end in `-FAIL.log` or `-SUCCESS.log` depending on status.
This is compatible with the way the testing farm log files are named, and
makes it easy to tell if a test passed or failed from the log file name.
Upload README.html as an artifact of the build_docs job for debugging
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
This will make the qemu/kvm tests run in either
ascending or descending ASCII order. This should give
us better test coverage of clean up scenarios which may
fail depending on the order of the previous tests.
Rename the qemu/kvm tests so that the statuses are shorter
and more intuitive.
Improve qemu/kvm test failure error reporting.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
These tests are problematic in the GitHub qemu tests, and that
functionality (scsi, anyway) is covered by the testing farm integration
tests.
Yes, we should have a way to provide tags on a per-role basis...
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
When running tests with a qemu managed node, the dhcp
used by qemu interferes with the dhcp used in the test, which
can cause the test to hang. Exclude the qemu interfaces from
using the test dhcp. Note that this only affects the qemu tests -
testing farm and other tests with "real" machines will have a
different mac address - the mac addresses used below are specific
to qemu virtual devices.
Also, just in case tests still time out, add a tests/ansible.cfg
with a 240-second task timeout to ensure any hung tasks are killed.
This will cause the playbook to exit with an error.
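A minimal `tests/ansible.cfg` for this could look like the following sketch (`task_timeout` is a stock ansible-core `[defaults]` option, supported since ansible-core 2.10):

```ini
[defaults]
# Kill any task that runs longer than 240 seconds so a hung DHCP
# exchange fails the playbook instead of hanging the CI job.
task_timeout = 240
```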
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
tox-lsr 3.6.0 will guarantee order of qemu test execution, which should
help make tests reproducible and help debug test failures.
Improve qemu test logging - this will help debug the qemu test
failures.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
Some systems do not use the `ethN` interface naming scheme, and
use `ensN` instead. The test wants to use `eth1` as the second
interface. If this does not exist, try `ens4` instead.
Some of our tests now run on an ubuntu control node (localhost)
and use `shell` to execute commands there. Ansible requires
the use of `pipefail`. The default shell on Ubuntu is not
bash and does not have `pipefail`.
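Such a task might look like the following sketch (task name and command are illustrative):

```yaml
- name: Run a pipeline on the control node
  # Force bash: Ubuntu's default /bin/sh is dash, which has no pipefail
  ansible.builtin.shell: set -o pipefail; some_command | grep pattern
  args:
    executable: /bin/bash
  delegate_to: localhost
  changed_when: false
```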
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
The validation was incorrectly checking for routing rule attributes on the top-level
NM module instead of on the NM.IPRoutingRule class. This was causing validation failures
because:
libnm's API has two core aspects:
1. NMConnection/NMSetting types for handling connection profiles
2. NMClient as a cache of D-Bus objects
The suppress_prefixlength and uid_range attributes are not part of the top-level NM
module but belong to NM.IPRoutingRule. Updated the validation to properly check for:
- set_suppress_prefixlength instead of NM_IP_ROUTING_RULE_ATTR_SUPPRESS_PREFIXLENGTH
- set_uid_range instead of NM_IP_ROUTING_RULE_ATTR_UID_RANGE_START
This aligns with the correct API usage and fixes the validation errors.
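The shape of the fix can be illustrated with stand-in objects (the real code checks the `gi`-provided `NM` bindings; these stand-ins only mirror the attribute layout named above):

```python
import types

# Stand-in for the gi-provided NM module: the setters live on the
# NM.IPRoutingRule class, not on the top-level module.
NM = types.ModuleType("NM")

class IPRoutingRule:
    def set_suppress_prefixlength(self, value):
        pass

    def set_uid_range(self, start, end):
        pass

NM.IPRoutingRule = IPRoutingRule

# Old (wrong) check: constants looked up on the module itself
print(hasattr(NM, "NM_IP_ROUTING_RULE_ATTR_SUPPRESS_PREFIXLENGTH"))  # False
# New (correct) check: setters looked up on the class
print(hasattr(NM.IPRoutingRule, "set_suppress_prefixlength"))  # True
print(hasattr(NM.IPRoutingRule, "set_uid_range"))  # True
```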
Resolves: https://issues.redhat.com/browse/RHEL-85872
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
When a user provides both an interface name and a MAC address, the
current validation process retrieves sysfs link info separately using
the interface name and the MAC address, then compares the results. If
the information doesn't match, an error is raised.
However, this approach may trigger false alarms because retrieving the
link info by MAC might return the link info that only matches the
current MAC instead of the permanent MAC. Since the interface name is
unique within the kernel, a more robust validation method is to fetch
the MAC address using the interface name and then compare it directly
with the user-provided MAC address.
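The revised check can be sketched like this (the function and the reader callback are hypothetical illustrations of the approach, not the role's code):

```python
def validate_mac(ifname, expected_mac, read_mac):
    """Fetch the MAC via the (kernel-unique) interface name and compare
    it with the user-provided MAC, instead of looking the link up by MAC
    (which may match the current rather than the permanent address)."""
    actual = read_mac(ifname).lower()
    if actual != expected_mac.lower():
        raise ValueError(
            f"{ifname}: MAC {actual} does not match {expected_mac}"
        )

# On a real system, read_mac could read /sys/class/net/<ifname>/address;
# a fake reader keeps the sketch self-contained.
fake_sysfs = {"ens4": "52:54:00:12:34:57"}
validate_mac("ens4", "52:54:00:12:34:57", fake_sysfs.__getitem__)
```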
Resolves: https://issues.redhat.com/browse/RHEL-84362
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
The link_info_find() function previously allowed searching for links by
MAC address, but this introduced ambiguity and could cause false alarms
in certain cases (e.g. retrieving the link info by MAC might return the
link info that only matches the current MAC instead of the permanent
MAC). To ensure reliable behavior, this function should accept and match
the link info only by interface name.
To address these issues, the following changes were made:
- Removed MAC address matching logic to eliminate ambiguity.
- Simplified the function to only check ifname, making it more
predictable.
- Updated all callers to adapt to this change, ensuring correctness.
- When a profile is tied to an interface via MAC only, validation of
the interface's existence will now be delegated to NetworkManager
instead.
Resolves: https://issues.redhat.com/browse/RHEL-84197
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
The tests should not install anything from outside of the distribution
unless absolutely necessary, like the copr repos.
All of the EPEL dependencies have been removed or replaced
with coprs.
We do not need to install pytest from pip since it is available
as `pytest-3` from `python3-pytest`.
We do not need `git` or `rsync` in the tests.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
* Calculate number of managed nodes with this formula:
(( number_of_test_playbooks / 10 + 1 ))
* Add README explaining how to run the plan locally and remotely
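The node-count formula above works out like this (a direct transcription of the shell arithmetic, which uses integer division):

```python
def managed_nodes(number_of_test_playbooks):
    # Shell's $(( n / 10 + 1 )): one node per ten playbooks,
    # plus one for the remainder batch.
    return number_of_test_playbooks // 10 + 1

print(managed_nodes(3))   # 1
print(managed_nodes(25))  # 3
```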
Signed-off-by: Sergei Petrosian <spetrosi@redhat.com>
* You can ignore words inline by adding a comment like `# codespell:ignore word`.
* You can ignore words by adding them to the `.codespell_ignores` file.
* You can ignore files and directories by adding them with `skip = ` to the `.codespellrc` file.
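Putting the file-based mechanisms together might look like this (the paths are examples, not the repository's actual configuration):

```ini
# .codespellrc
[codespell]
skip = ./.git,./tests/tmp
ignore-words = .codespell_ignores
```

The inline form works directly in source, e.g. `some_wrod = 1  # codespell:ignore wrod`.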
Signed-off-by: Sergei Petrosian <spetrosi@redhat.com>
There is a new version of ansible-lint - v25.
Newer versions of ansible-lint require the collection requirements to be
installed so it can find the modules/plugins.
Enhance our ansible-lint ci job to provide the collection requirements,
including merging the runtime meta/collection-requirements.yml with
the testing tests/collection-requirements.yml.
This should somewhat mitigate the loss of ansible-plugin-scan.
We have to remove mock_modules that are actually present now.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
ansible-plugin-scan is broken due to lack of support for older versions
of python in ci.
One of the main reasons for using this scan is to check if the roles/tests
are using plugins that are not compatible with ansible 2.9. Since 2.9
is EOL, this is no longer necessary.
The other reason for using the scan is to check that the role/test
author has correctly listed dependencies in meta/collection-requirements.yml
and tests/collection-requirements.yml - that is - that the author has
correctly specified the dependencies for any plugins used that are
not built-in. This will mostly be caught in CI testing now.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
Updated the link_info_find method to prioritize matching links by
perm-address when it is valid and available. If the perm-address is
unavailable (None or "00:00:00:00:00:00"), the method falls back to
matching by address. Additionally, if ifname is provided, it takes
precedence and returns the corresponding linkinfo immediately.
The change resolves scenarios where multiple network interfaces might
share the same current MAC address (address), leading to potential
ambiguity in link matching. By prioritizing the permanent MAC address
(perm-address), the method provides a more precise and consistent match.
This is particularly crucial in environments with:
- MAC address spoofing or dynamic changes, where the current MAC
address may not reliably identify the interface.
- Virtual interfaces or VLANs, which often lack a valid perm-address
and rely on the parent interface's address.
- Ambiguity when multiple interfaces share the same address.
This change improves the robustness of MAC address matching by ensuring
that permanent addresses are prioritized while maintaining a reliable
fallback mechanism for interfaces with no permanent address.
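The matching priority can be sketched as follows (a simplified illustration of the order described above, not the actual implementation):

```python
NO_PERM_ADDR = (None, "", "00:00:00:00:00:00")

def link_info_find(links, ifname=None, mac=None):
    """links maps interface name -> link info dict (sketch)."""
    # 1. An explicit ifname takes precedence and matches immediately.
    if ifname is not None:
        return links.get(ifname)
    if mac is None:
        return None
    mac = mac.lower()
    # 2. Prefer a valid permanent MAC address (perm-address).
    for info in links.values():
        perm = info.get("perm-address")
        if perm not in NO_PERM_ADDR and perm.lower() == mac:
            return info
    # 3. Fall back to the current address, e.g. for VLANs or virtual
    #    interfaces that have no usable perm-address.
    for info in links.values():
        if (info.get("address") or "").lower() == mac:
            return info
    return None

links = {
    "eth0": {"address": "aa:aa:aa:aa:aa:aa",
             "perm-address": "bb:bb:bb:bb:bb:bb"},
    "eth0.100": {"address": "aa:aa:aa:aa:aa:aa", "perm-address": None},
}
# Both links share the current address; perm-address disambiguates.
print(link_info_find(links, mac="bb:bb:bb:bb:bb:bb") is links["eth0"])  # True
```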
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
Add support for the `wait_ip` property: the system will consider the
connection activated only when the specified IP stack is configured.
This enables flexibility in scenarios such as
IPv6-only networks, where the overall network configuration can still
succeed when IPv4 configuration fails but IPv6 completes successfully.
The `wait_ip` can be configured with the following possible values:
* "any": System will consider the interface activated when any IP stack
is configured.
* "ipv4": System will wait until IPv4 has been configured.
* "ipv6": System will wait until IPv6 has been configured.
* "ipv4+ipv6": System will wait until both IPv4 and IPv6 have been
configured.
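In the role's `network_connections` syntax this might look like the following sketch (assuming `wait_ip` sits under the `ip` settings; the interface name is illustrative):

```yaml
network_connections:
  - name: eth0
    type: ethernet
    ip:
      # Consider the connection activated once IPv6 is configured,
      # even if IPv4 (dhcp4) never completes.
      wait_ip: "ipv6"
      dhcp4: true
      auto6: true
```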
Resolves: https://issues.redhat.com/browse/RHEL-63026
Signed-off-by: Wen Liang <wenliang@redhat.com>
We have a lot of requests to support Rocky and Alma in various system roles. The
first part of adding support is adding `vars/` files for these platforms. In
almost every case, for a given major version N, the vars file RedHat_N.yml can
be used for CentOS, Rocky, and Alma. Rather than making a copy of the
RedHat_N.yml file, just use a symlink to reduce size and maintenance burden, and
standardize this across all system roles for consistency.
NOTE: There is no Alma or Rocky version 7 or earlier.
NOTE: OracleLinux is not a strict clone, so we are not going to do this for
OracleLinux at this time. Support for OracleLinux will need to be done in
separate PRs. For more information, see
https://github.com/linux-system-roles/cockpit/issues/130
**Question**: Why not just use `ansible_facts["os_family"] == "RedHat"`?
**Answer**: This is what Ansible uses as the RedHat os_family:
1e6ffc1d02/lib/ansible/module_utils/facts/system/distribution.py (L511)
There are a lot of distributions in there. I know that Fedora is not a clone of
RHEL, but it is very closely related. Most of the others are not clones, and it
would generally not work to replace ansible_distribution in ['CentOS', 'Fedora',
'RedHat'] with ansible_facts['os_family'] == 'RedHat' (but it would probably
work in specific cases with specific distributions). For example, OracleLinux
is in there, and we know that doesn't generally work. The only ones we can be
pretty sure about are `RedHat`, `CentOS`, `Fedora`, `AlmaLinux`, and `Rocky`.
**Question**: Does my role really need this because it should already work on
RHEL clones?
**Answer**: Maybe not - but:
* it doesn't hurt anything
* it's there if we need it in the future
* the role will be inconsistent with the other system roles if we don't have this
**Question**: Why do I need the `tests/vars/rh_distros_vars.yml` file? Doesn't
the test load the vars from the role?
**Answer**: No, the test does not load the vars from the role until the role is
included, and many tests use version and distribution before including the role.
**Question**: Do we need to change the code now to use the new variables?
**Answer**: No, not now, in subsequent PRs, hopefully by Alma and Rocky users.
Note that there may be more work to be done to the role to fully support Rocky
and Alma. Many roles have conditionals like this:
```yaml
some_var: "{{ 'some value' if ansible_distribution in ['CentOS', 'RedHat'] else 'other value' }}"
another_var: "{{ 'some value' if ansible_distribution in ['CentOS', 'Fedora', 'RedHat'] else 'other value' }}"
...
- name: Do something
when: ansible_distribution in ['CentOS', 'RedHat']
...
- name: Do something else
when: ansible_distribution in ['CentOS', 'Fedora', 'RedHat']
...
```
Adding Rocky and AlmaLinux to these conditionals will have to be done
separately. In order to simplify the task, some new variables are being
introduced:
```yaml
__$rolename_rh_distros:
- AlmaLinux
- CentOS
- RedHat
- Rocky
__$rolename_rh_distros_fedora: "{{ __$rolename_rh_distros + ['Fedora'] }}"
__$rolename_is_rh_distro: "{{ ansible_distribution in __$rolename_rh_distros }}"
__$rolename_is_rh_distro_fedora: "{{ ansible_distribution in __$rolename_rh_distros_fedora }}"
```
Then the conditionals can be rewritten as:
```yaml
some_var: "{{ 'some value' if __$rolename_is_rh_distro else 'other value' }}"
another_var: "{{ 'some value' if __$rolename_is_rh_distro_fedora else 'other value' }}"
...
- name: Do something
when: __$rolename_is_rh_distro | bool
...
- name: Do something else
when: __$rolename_is_rh_distro_fedora | bool
...
```
For tests - tests that use such conditionals will need to use `vars_files` or
`include_vars` to load the variables that are defined in
`tests/vars/rh_distros_vars.yml`:
```yaml
vars_files:
- vars/rh_distros_vars.yml
```
We don't currently have CI testing for Rocky or Alma, so someone wanting to run
tests on those platforms would need to change the test code to use these.
Signed-off-by: Rich Megginson <rmeggins@redhat.com>
There is no fine-grained control over the number of retries for
automatically reconnecting a network connection in the role. This
limitation can be problematic for certain use cases where extending the
retry process is critical, particularly in environments with unstable
networks. Introduce support for the `autoconnect_retries` property in the
`nm` provider of the `network_connections` variable. This feature allows
users to configure how many times NetworkManager will attempt to
reconnect a connection after an autoconnect failure, providing more
control over network stability and performance.
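A sketch of the new property in use (the connection name is illustrative, and the placement of `autoconnect_retries` at the connection level next to `autoconnect` is an assumption based on the description above):

```yaml
network_connections:
  - name: eth0
    type: ethernet
    autoconnect: true
    # Retry up to 5 times after an autoconnect failure
    autoconnect_retries: 5
    ip:
      dhcp4: true
```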
Resolves: https://issues.redhat.com/browse/RHEL-61599
Signed-off-by: Wen Liang <liangwen12year@gmail.com>
* Add "BusinessUnit": "system_roles" environment setting to tag our jobs in Testing farm
* Add tmt_plan_filter to run additional workflows besides general
* Allow more [citest bad] comment formats
* Get memory and supported platforms info from the PR ref
* Move LINUXSYSTEMROLES_USER to vars and use it everywhere in tft.yml
* Remove extra GITHUB_ORG definition
Signed-off-by: Sergei Petrosian <spetrosi@redhat.com>
For an ethernet device which has a kernel link, we cannot (and should
not) delete the device using the `network_state` variable.
We can only use the `network_state` variable to delete a virtual NIC
that is created by NM/Nmstate.
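For example (nmstate-style `network_state`; the interface name is illustrative):

```yaml
network_state:
  interfaces:
    # Supported: dummy0 is a virtual NIC created by NM/Nmstate,
    # so "state: absent" removes it.
    - name: dummy0
      type: dummy
      state: absent
```

By contrast, a physical ethernet device such as `eth0` is backed by a kernel link, so it cannot be deleted this way.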
Signed-off-by: Wen Liang <liangwen12year@gmail.com>