Merge "Add debug guide"

Zuul 2020-06-11 16:16:39 +00:00 committed by Gerrit Code Review
commit f5cbb7ba7e
3 changed files with 139 additions and 6 deletions


@ -33,6 +33,7 @@ Additional StarlingX-specific resources are listed below.
development_process
../developer_resources/code-submission-guide
../developer_resources/debug_issues
--------------------
Additional resources


@ -0,0 +1,131 @@
======================
Debug StarlingX Issues
======================
This guide describes basic steps for debugging issues on StarlingX.
.. contents::
:local:
:depth: 1
----------------
Record the issue
----------------
Record information about the issue so it can be reproduced during debugging. The
items below describe some issue characteristics to capture.
* Deployment issue type, such as bootstrap failure, provisioning failure, or
functional failure.
* Check the StarlingX version with the command:
::
cat /etc/build.info
* Check the StarlingX deployment configuration (Simplex, Duplex, or
Multi-node) by viewing the platform configuration file:
::
cat /etc/platform/platform.conf
* Server type, such as bare metal server(s) or VMs.
* Hardware device types and characteristics, such as NICs, PCI cards, number of
hard disks, and RAM size.
* Other details of the issue, such as the steps to reproduce, the expected
results, and the actual results.
* Whether the issue can be reproduced consistently or only occasionally.
* Gather log files and configuration files using the ``collect`` command.
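The version and configuration checks above can be captured in a single file to
attach to a bug report. The following is a minimal sketch, not an official
StarlingX tool: the function name ``record_issue_context`` and the report file
name are hypothetical, and the two input paths default to the files named in
the list above.

```shell
# Hypothetical helper: copy build and platform details into one report file.
# Usage: record_issue_context <report_file> [build_info] [platform_conf]
record_issue_context() {
    local report="$1"
    local build_info="${2:-/etc/build.info}"
    local platform_conf="${3:-/etc/platform/platform.conf}"
    local f

    {
        echo "Recorded: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
        for f in "$build_info" "$platform_conf"; do
            echo "===== $f ====="
            if [ -r "$f" ]; then
                cat "$f"
            else
                # Keep a marker so a missing file is visible in the report.
                echo "(not readable)"
            fi
        done
    } > "$report"
}
```

For example, ``record_issue_context /tmp/issue-context.txt`` writes both files
into one report. This complements the ``collect`` command and does not replace
it.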
---------------------
Check status and logs
---------------------
* Log in to the active controller.
* Check services using the ``sm-dump`` command:
::
sudo sm-dump
* Check systemd-managed services using the ``systemctl`` command, for example
``systemctl status <service>``.
* Load the platform environment variables for the ``sysadmin`` user using:
::
source /etc/platform/openrc
* Check alarms from Fault-Manager using:
::
fm alarm-list --uuid
fm alarm-show <uuid>
* Search for errors in ``/var/log``.
* You **must** check ``/var/log/sysinv.log`` for errors.
* You can get hints from ``sysinv.log`` for many deployment failures.
* Look into other log files based on the functional area.
* If a functional area log file includes errors, check the associated
configuration file, which is typically located under the ``/etc/``
directory.
* You may need to enable the ``debug`` option in the configuration file.
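The log search described above can be narrowed to the most recent matches. The
following is a minimal sketch: the function name ``recent_errors`` and the
default of 20 lines are hypothetical choices, and matching the word "error"
case-insensitively is an assumption about what is worth reviewing first.

```shell
# Hypothetical helper: print the most recent lines that mention "error"
# (case-insensitive) from a log file, prefixed with their line numbers.
# Usage: recent_errors [log_file] [max_lines]
recent_errors() {
    local log="${1:-/var/log/sysinv.log}"
    local max="${2:-20}"
    grep -n -i 'error' "$log" | tail -n "$max"
}
```

For example, ``recent_errors`` with no arguments searches
``/var/log/sysinv.log``, and ``recent_errors <other_log_file> 50`` applies the
same search to another functional area log. Reading logs under ``/var/log``
may require ``sudo``.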
----------------
Debug and triage
----------------
* Check the status of Kubernetes resources: nodes, pods/jobs, endpoints,
services, secrets, and configmaps.
* Check the two major namespaces: ``kube-system`` and ``openstack``.
* If issues occur inside containerized components, enter the container
using the ``kubectl exec`` command.
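The Kubernetes checks above can be scripted as a loop over resource types and
namespaces. The following is a minimal sketch: the function names
``k8s_status`` and ``k8s_enter`` and the ``KUBECTL`` override variable are
hypothetical, and it assumes ``kubectl`` is already configured to reach the
cluster from the active controller.

```shell
# Hypothetical helper: list the resource types named above in both namespaces.
# KUBECTL can be overridden (for example, KUBECTL=echo for a dry run).
k8s_status() {
    local kubectl="${KUBECTL:-kubectl}"
    local ns kind

    # Nodes are cluster-scoped, so they are listed once.
    $kubectl get nodes -o wide

    for ns in kube-system openstack; do
        for kind in pods jobs endpoints services secrets configmaps; do
            echo "--- $kind in $ns ---"
            $kubectl get "$kind" -n "$ns"
        done
    done
}

# Hypothetical helper: open a shell inside a pod's default container.
# Usage: k8s_enter <namespace> <pod> [shell]
k8s_enter() {
    ${KUBECTL:-kubectl} exec -it -n "$1" "$2" -- "${3:-/bin/sh}"
}
```

For example, ``k8s_enter openstack <pod_name>`` starts ``/bin/sh`` in the named
pod. The default shell is an assumption, since some container images ship a
different shell or none at all.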
---------------
Implement fixes
---------------
* You can try to resolve the issue by making changes online, without
rebooting Linux or redeploying StarlingX. For example, you can modify
system configuration files or the StarlingX configuration or database,
and then restart the corresponding services using the ``systemctl``
command or the StarlingX ``sm`` (service management) command.
* If the fixes must be applied to certain nodes (controller, worker, storage),
you can temporarily **lock** that node, make changes using StarlingX
commands, and then **unlock** the node to make the changes take effect.
* If the changes must be made in C/C++/Go code, you can:
* Make the changes in your *development workspace* with the StarlingX
codebase.
* Build the related packages using ``build-pkgs <package_name>``.
* Create and apply the patch using the :doc:`starlingx_patching` guide.
* Restart the services using the ``systemctl`` command or the StarlingX
``sm`` (service management) command.
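The lock, change, and unlock sequence described above can be wrapped so that
the node is unlocked again even when the change command fails. The following
is a minimal sketch: the function name ``with_host_locked`` and the
``SYSTEM_CMD`` override variable are hypothetical, and it assumes the platform
environment has already been loaded with ``source /etc/platform/openrc`` so
that the ``system host-lock`` and ``system host-unlock`` commands are
available.

```shell
# Hypothetical helper: lock a host, run a change command, then unlock the host.
# SYSTEM_CMD can be overridden (for example, SYSTEM_CMD=echo for a dry run).
# Usage: with_host_locked <hostname> <command> [args...]
with_host_locked() {
    local system_cmd="${SYSTEM_CMD:-system}"
    local host="$1"
    local rc
    shift

    $system_cmd host-lock "$host" || return 1
    # A real script would wait here until the host reports that it is locked.
    "$@"
    rc=$?
    # Unlock even if the change failed, so the host is not left locked.
    $system_cmd host-unlock "$host" || return 1
    return "$rc"
}
```

Running it with ``SYSTEM_CMD=echo`` first prints the commands without changing
the system, which is a cheap way to review the sequence before applying it to
a real node.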
--------------------
Additional resources
--------------------
* Review the `StarlingX Discuss list <http://lists.starlingx.io/pipermail/starlingx-discuss/>`_
for similar questions and workarounds from the community.
* Check the `StarlingX Launchpad <https://launchpad.net/starlingx>`_ for
similar issues and potential workarounds.
* Open a new `StarlingX Launchpad <https://launchpad.net/starlingx>`_ item to
report a bug.


@ -10,16 +10,17 @@ Developer Resources
build_guide
Layered_Build
backup_restore
build_docker_image
code-submission-guide
debug_issues
stx_tsn_in_kata
mirror_repo
move_to_new_openstack_version_in_starlingx
navigate_source_code
Project Specifications <https://docs.starlingx.io/specs/>
architecture_docs
starlingx_patching
build_docker_image
move_to_new_openstack_version_in_starlingx
mirror_repo
backup_restore
Project Specifications <https://docs.starlingx.io/specs/>
stx_ipv6_deployment
stx_tsn_in_kata