chara:restarting_servers

chara:restarting_servers [2017/06/29 20:33]
127.0.0.1 external edit
chara:restarting_servers [2018/07/05 10:55] (current)
jones [Restarting Servers using the rc.local file:]
====== Restarting Servers using the bootlaunch paradigm ======
A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>", where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running -- they just output the process ID of the running server. These scripts also take care of the entry in the socket manager as well as any serial port lock files. All the pertinent information is world writable, so one should be able to run a bootlaunch script as observe.

One thing to note about the output of the bootlaunch scripts: they call a number of other programs whose own output may be misleading in the context of bootlaunch. Chief among these is "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager, and that entry has to be cleaned out before a new server can be launched. To do that, the script calls "tsockman remove <entry>" to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question is launched (without fanfare) right after that output text.

Here are the available bootlaunch scripts as of June 2017:

gps computer:
  
  * bootlaunch_upper -- Starts the E1_Upper, E2_Upper, S1_Upper, S2_Upper, W1_Upper, or W2_Upper server, depending on the machine it's launched from.

Note: The bootlaunch scripts will not start a new server if there is an existing process running. Therefore, type "ps aux | grep //server_name//", where server_name is the name of the server. If a hung process is still listed, look up its process identification number (PID), type "kill -9 PID" to kill the process, and then run the relevant bootlaunch script again.

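
The check described in the note can be sketched as a small shell helper. This is only a sketch under stated assumptions and is not part of the bootlaunch scripts: the function name ''server_pids'' is hypothetical, and it assumes the usual "ps aux" column layout (PID in the second column, command starting in the eleventh). It skips the line for the grep command itself, which would otherwise also match the server name.

```shell
# Hypothetical helper (not part of bootlaunch): read "ps aux" style text on
# stdin and print the PID of every process whose command line contains the
# given server name. Lines that belong to a grep command are skipped.
server_pids() {
    awk -v name="$1" '
        $1 == "USER" && $2 == "PID" { next }      # skip the ps header line
        {
            # Rebuild the command line from field 11 onwards.
            cmd = ""
            for (i = 11; i <= NF; i++) cmd = cmd (i > 11 ? " " : "") $i
            # Report the PID unless the command itself is grep.
            if (index(cmd, name) && $11 !~ /(^|\/)grep$/) print $2
        }'
}
```

A possible use would be "ps aux | server_pids rpc_server"; if it prints nothing, no matching server process was found and the bootlaunch script can be run directly.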
====== Restarting Servers using the rc.local file ======
This procedure is applicable to servers that have not switched over to the bootlaunch paradigm.

Sometimes servers get locked up, crash, or stop working in some way and need to be restarted. The socket manager is no longer used to restart them, so it is necessary to restart them manually using a terminal. Here's how:

1. Open a terminal window and connect to the computer that runs the server. These are listed at the bottom of this page. For example, to connect to the S1 telescope control computer, type "s1".

2. See if there is a process running on the machine. For example, if we were concerned about the RPC (Remote Power Control) server on s1 we would use the following (**NOTE: The telescope RPC servers are no longer started in this way, see the bootlaunch command above - these instructions are still relevant for servers not running under the bootlaunch paradigm**):

(s1:1001) ps aux | grep rpc_server\\
theo 15219 0.0 0.0 61188 752 pts/6 S+ 13:56 0:00 grep rpc_server

The second column is the PID (process ID); the steps below use 15219 as the example PID. Note that the grep command itself also shows up in this listing (its command column reads "grep rpc_server"), so make sure the PID you pick belongs to the server and not to the grep.

3. Try to stop the process gracefully by using "kill -2 15219" (this sends SIGINT).

4. Redo the "ps aux | grep ....." command to see if that worked. If the PID is gone, the shutdown was successful. If not, force it to stop with "kill -9 15219" (SIGKILL).

5. Make sure there is nothing left in the socket manager by using the GUI or the text command

tsockman rm RPC_S1

6. Find the restart command in the file /etc/rc.local or in the text files on the desktop.

(s1:1003) grep rpc_server /etc/rc.local\\
/usr/local/bin/rpc_server /dev/ttyC1 /ctrscrut/chara/etc/s1_scope.cfg >& /var/log/rpc.log &

7. Use that command to restart the server, but remove the log redirection (the ">&" and the log file name that follows it) while keeping the trailing "&" so the server runs in the background:

/usr/local/bin/rpc_server /dev/ttyC1 /ctrscrut/chara/etc/s1_scope.cfg &

8. Sometimes it may complain about a lockfile. This can be removed with the command

(s1:1003) rm /var/lock/LCK..ttyC1

and you may have to remove the socket manager entry again.

Some servers have to run as root, for example the wave front sensors, but most are best run as observe.

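
Steps 3, 4 and 7 of the procedure above can be sketched as two shell helpers. This is a minimal sketch under stated assumptions, not a script that exists on the CHARA machines: the function names ''stop_server'' and ''strip_log_redirect'' and the default grace period of 5 seconds are hypothetical choices made for illustration.

```shell
# Hypothetical helper for steps 3 and 4: ask a process to stop with SIGINT
# (kill -2), give it a few seconds, and only then fall back to SIGKILL
# (kill -9). The second argument is the grace period in seconds (assumed
# default: 5).
stop_server() {
    pid=$1
    grace=${2:-5}
    kill -0 "$pid" 2>/dev/null || return 0        # no such process: nothing to do
    kill -2 "$pid" 2>/dev/null                    # step 3: graceful request
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0    # step 4: the PID is gone
        sleep 1
        grace=$((grace - 1))
    done
    kill -9 "$pid" 2>/dev/null                    # step 4: force it to stop
    return 0
}

# Hypothetical helper for step 7: read an rc.local line on stdin, drop the
# ">& logfile" redirection, and keep a trailing "&" so the server still runs
# in the background.
strip_log_redirect() {
    sed 's/[[:space:]]*>&.*$/ \&/'
}
```

For example, "grep rpc_server /etc/rc.local | strip_log_redirect" would print the command to paste into the terminal, and "stop_server 15219" would carry out the kill sequence for the example PID. Removing the socket manager entry and any lockfile (steps 5 and 8) is still done by hand.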
===== The following servers run on CTRSCRUT. =====
chara/restarting_servers.1498782832.txt.gz · Last modified: 2017/06/29 20:33 by 127.0.0.1