chara:trouble_shooting

Differences

This shows you the differences between two versions of the page.

chara:trouble_shooting [2018/09/25 18:57]
gail_stargazer
chara:trouble_shooting [2019/05/24 19:41]
gail_stargazer
Line 28: Line 28:
 ==== Restarting Servers using the bootlaunch paradigm ====
  
- \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>" where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running – they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world writeable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing of note about the output of the bootlaunch scripts: they call a number of other programs which themselves have output that may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. In order to launch a new server, one needs to clean out the socket manager entry if it is there. To do that, "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
+ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list".
+
+ To restart a server, log on to the machine that runs the server and type "bootlaunch_master". This script goes through the list of executables and checks which servers are running. If a server isn't running, bootlaunch_master removes it from socket manager, clears the lock file, and relaunches the server. The instructions below describe how to restart individual servers, but this should no longer be necessary. \\  \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>" where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running – they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world writeable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing of note about the output of the bootlaunch scripts: they call a number of other programs which themselves have output that may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. In order to launch a new server, one needs to clean out the socket manager entry if it is there. To do that, "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
  
   * bootlaunch_beamsamp – Starts the beam sampler servers, BS1 and BS2.
Line 45: Line 47:
 ==== Restarting Servers using the rc.local file ====
  
- \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /chara/observe/socket_manager.list". Note that servers can be running fine, but if the Socket Manager drops the connection to them, they are as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
+ \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list". Note that servers can be running fine, but if the Socket Manager drops the connection to them, they are as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
  \\  \\
 Log on to the relevant computer by typing the computer name (ctrscrut, ople, s1, …). If the shortcut doesn't work then type "ssh //name//" where name is the computer name. \\  \\ Find out if the server is running by typing "ps aux | grep //server_name//" where server_name is the name of the server. \\ [ctrscrut:599] ps aux | grep pico_1 \\ observe 9281 0.0 0.0 61156 692 pts/3 S+ 13:58 0:00 grep pico_1 \\ observe 12578 0.0 0.0 24524 11212 ? S Jun16 33:14 /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg \\  \\ If the entry for the dead server shows up in the process list, then identify the process identification number (12578 for the example above) and kill the server by typing "kill -9 //PID//" where PID is the process identification number. \\  \\ Look up the commands to restart the server by typing "more /etc/rc.local" (this is relevant for servers that run in the background). Press the space bar to scroll through the contents of the rc.local file. Locate the commands relevant for the server that needs to be restarted, then copy and paste them into a terminal window: \\  \\ #Start PICO server for PICO #1 \\ /bin/rm -f /var/lock/LCK..ttyC8 \\ /usr/local/bin/tsockman remove PICO_1 \\ /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg & \\  \\ The first command removes the lock to allow the server to restart. The second command removes the name from the socket manager listing. The last command restarts the server. Note that if you are restarting the servers as observe, you will need to remove the part of the command in the rc.local file that saves information in the /var/log///server_name//.log file (the actual command typed should resemble the last line above). \\  \\ There are text files on the desktop with many of the restart commands. Use these files for quick access to the relevant commands. The commands are edited and can be copied exactly as written. Files include Dome servers and all servers running on ctrscrut. Many of these commands are also located on the [[:chara:restarting_servers|Restarting Servers]] page.
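The three manual restart commands above (clear the lock file, remove the socket manager entry, relaunch the server) can be sketched as one shell function. This is only an illustration, not one of the site's scripts: the names restart_server, run, and DRY_RUN are hypothetical, and the sketch assumes any old server process has already been killed as described above. With DRY_RUN=1 it only prints the commands it would run, which makes it easy to check against the rc.local entry before running anything.

```shell
#!/bin/sh
# Sketch only: restart_server, run and DRY_RUN are hypothetical names,
# not part of the existing bootlaunch or rc.local tooling.
# Usage: restart_server <socket_manager_name> <lock_file> <command> [args...]

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_server() {
    name=$1
    lock=$2
    shift 2
    # Step 1: remove a stale serial port lock file, if there is one.
    run /bin/rm -f "$lock"
    # Step 2: remove a stale socket manager entry. If there is no entry,
    # tsockman reports "Process by that name does not exist"; that is normal.
    run /usr/local/bin/tsockman remove "$name"
    # Step 3: relaunch the server in the background.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$* &"
    else
        "$@" &
    fi
}
```

For the PICO #1 example, setting DRY_RUN=1 and calling "restart_server PICO_1 /var/lock/LCK..ttyC8 /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg" prints the same three commands listed in the rc.local excerpt above.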
chara/trouble_shooting.txt · Last modified: 2023/11/21 01:42 by charaobs