Merge "Add debug guide"
commit f5cbb7ba7e
@@ -33,6 +33,7 @@ Additional StarlingX-specific resources are listed below.

   development_process
   ../developer_resources/code-submission-guide
   ../developer_resources/debug_issues

--------------------
Additional resources

131
doc/source/developer_resources/debug_issues.rst
Normal file
@@ -0,0 +1,131 @@
======================
Debug StarlingX Issues
======================

This guide contains some basic steps for debugging issues on StarlingX.

.. contents::
   :local:
   :depth: 1

----------------
Record the issue
----------------

Record information about the issue so it can be reproduced during debugging. The
items below describe some issue characteristics to capture.

* Deployment issue type, such as bootstrap failure, provisioning failure, or
  functional failure.

* The StarlingX version, which you can check with the command:
  ::

    cat /etc/build.info

* The StarlingX deployment configuration (such as Simplex, Duplex, or
  Multi-node), which you can check by viewing the platform configuration file:
  ::

    cat /etc/platform/platform.conf

* Server type, such as bare metal server(s) or VMs.

* Hardware device types and characteristics, such as NICs, PCI cards, number of
  hard disks, and RAM size.

* Other aspects of the issue, including steps to reproduce, expected results,
  and actual results.

* Whether the issue can be reproduced consistently or only occasionally.

* Log files and configuration files, gathered using the ``collect`` command.

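The version and configuration checks above can be captured in one step. The
following is a minimal sketch, not a StarlingX tool: the function name
``record_issue_context`` and the ``== file ==`` output layout are assumptions
made for illustration. Only the two file paths come from this guide.

```shell
# record_issue_context [BUILD_INFO] [PLATFORM_CONF]
# Print the version and deployment files so they can be attached to a bug
# report. The function name and output layout are hypothetical; the default
# paths are the ones named in this guide.
record_issue_context() {
    local build_info="${1:-/etc/build.info}"
    local platform_conf="${2:-/etc/platform/platform.conf}"
    local f
    for f in "$build_info" "$platform_conf"; do
        echo "== $f =="
        if [ -r "$f" ]; then
            cat "$f"
        else
            # Keep going so a missing file is visible in the report.
            echo "(not readable)"
        fi
    done
}
```

For example, ``record_issue_context > issue_context.txt`` would save both files
to a single report that can accompany the output of ``collect``.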
---------------------
Check status and logs
---------------------

* Log in to the active controller.

* Check services using the ``sm-dump`` command:
  ::

    sudo sm-dump

* Check services using the ``systemctl`` command.

* Load the platform environment for the ``sysadmin`` user using:
  ::

    source /etc/platform/openrc

* Check alarms from Fault-Manager using:
  ::

    fm alarm-list --uuid
    fm alarm-show <uuid>

* Search for errors in ``/var/log``.

  * You **must** check ``/var/log/sysinv.log`` for errors.
  * You can get hints from ``sysinv.log`` for many deployment failures.
  * Look into other log files based on the functional area.

* If a functional area log file includes errors, check the associated
  configuration file, which is typically located under the ``/etc/``
  directory.

* You may need to enable the ``debug`` option in the configuration file.

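The log search described above can be narrowed with ``grep``. The sketch below
is one possible approach. The function name ``scan_log`` and the keyword
pattern (``error``, ``critical``, ``traceback``) are assumptions, so adjust the
pattern to the messages the affected service actually emits.

```shell
# scan_log [FILE] [MAX_LINES]
# Show the most recent lines in a log file that match common failure
# keywords, prefixed with their line numbers. The keyword list is an
# assumption, not something StarlingX defines.
scan_log() {
    local log_file="${1:-/var/log/sysinv.log}"
    local max_lines="${2:-20}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # -n prints line numbers, -i ignores case, -E enables alternation.
    grep -n -i -E 'error|critical|traceback' "$log_file" | tail -n "$max_lines"
}
```

Called with no arguments, ``scan_log`` checks ``/var/log/sysinv.log``. Pass
another path to inspect the log file for a different functional area. Reading
files under ``/var/log`` may require ``sudo``.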
----------------
Debug and triage
----------------

* Check the status of Kubernetes resources: nodes, pods and jobs, endpoints,
  services, secrets, and configmaps.

* Check the two major namespaces: ``kube-system`` and ``openstack``.

* If issues occur inside containerized components, enter the affected
  container using the ``kubectl exec`` command.

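The checks above map onto standard ``kubectl`` subcommands. The following
sketch assumes ``kubectl`` is already configured on the active controller. The
function names and the ``KUBECTL`` override variable are hypothetical, and
``enter_pod`` assumes the container image provides ``/bin/sh``.

```shell
# KUBECTL can be overridden, for example to point at a wrapper script.
KUBECTL="${KUBECTL:-kubectl}"

# check_k8s_status
# List nodes, then the resource types named in this guide for the two
# major namespaces.
check_k8s_status() {
    local ns
    # Nodes are cluster-scoped, so no namespace is given.
    $KUBECTL get nodes
    for ns in kube-system openstack; do
        echo "== namespace: $ns =="
        $KUBECTL get pods,jobs,endpoints,services,secrets,configmaps -n "$ns"
    done
}

# enter_pod NAMESPACE POD
# Open an interactive shell inside a pod (assumes /bin/sh exists in it).
enter_pod() {
    $KUBECTL exec -it "$2" -n "$1" -- /bin/sh
}
```

For example, ``enter_pod openstack <pod_name>`` opens a shell in a pod from the
``openstack`` namespace, where ``<pod_name>`` is taken from the output of
``check_k8s_status``.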
---------------
Implement fixes
---------------

* You can try to resolve the issue by manually making some online
  changes without rebooting Linux or re-deploying StarlingX. For
  example, you can modify system config files or the StarlingX
  config/database. After making the changes, restart the corresponding
  services using the ``systemctl`` command or the StarlingX ``sm`` (service
  management) command.

* If the fixes must be applied to specific nodes (controller, worker, storage),
  you can temporarily **lock** the node, make changes using StarlingX
  commands, and then **unlock** the node to make the changes take effect.

* If the changes must be made in C/C++/Go code, you can:

  * Make the changes in your *development workspace* with the StarlingX
    codebase.
  * Build the related packages using ``build-pkgs <package_name>``.
  * Create and apply the patch using the :doc:`starlingx_patching` guide.
  * Restart the services using the ``systemctl`` command or the StarlingX
    ``sm`` (service management) command.

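The lock, change, and unlock sequence described above can be wrapped so the
node is always unlocked again. This is a sketch under several assumptions: it
uses the StarlingX ``system host-lock`` and ``system host-unlock`` commands,
the platform environment has already been loaded with
``source /etc/platform/openrc``, and the lock has completed before the change
runs. A production script would likely poll ``system host-show`` until the node
reports as locked. The function name and the ``SYSTEM_CMD`` override variable
are hypothetical.

```shell
# SYSTEM_CMD can be overridden, for example for a dry run.
SYSTEM_CMD="${SYSTEM_CMD:-system}"

# with_host_locked HOST COMMAND [ARGS...]
# Lock HOST, run COMMAND, then unlock HOST. Returns COMMAND's exit status,
# or 1 if the lock or unlock request itself fails.
with_host_locked() {
    local host="$1"
    local status
    shift
    $SYSTEM_CMD host-lock "$host" || return 1
    # Assumption: the lock has taken effect by the time the change runs.
    "$@"
    status=$?
    # Unlock even when the change failed, so the node is not left locked.
    $SYSTEM_CMD host-unlock "$host" || return 1
    return "$status"
}
```

Setting ``SYSTEM_CMD=echo`` prints the lock and unlock requests without sending
them; the change command itself still runs. This is useful for checking the
sequence before running it against a real node.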
--------------------
Additional resources
--------------------

* Review the `StarlingX Discuss list <http://lists.starlingx.io/pipermail/starlingx-discuss/>`_
  for similar questions and workarounds from the community.

* Check the `StarlingX Launchpad <https://launchpad.net/starlingx>`_ for
  similar issues and potential workarounds.

* Open a new `StarlingX Launchpad <https://launchpad.net/starlingx>`_ item to
  report a bug.

@@ -10,16 +10,17 @@ Developer Resources

   build_guide
   Layered_Build
   backup_restore
   build_docker_image
   code-submission-guide
   debug_issues
   stx_tsn_in_kata
   mirror_repo
   move_to_new_openstack_version_in_starlingx
   navigate_source_code
   Project Specifications <https://docs.starlingx.io/specs/>
   architecture_docs
   starlingx_patching
   build_docker_image
   move_to_new_openstack_version_in_starlingx
   mirror_repo
   backup_restore
   Project Specifications <https://docs.starlingx.io/specs/>
   stx_ipv6_deployment
   stx_tsn_in_kata
