chara:trouble_shooting

Differences

This shows you the differences between two versions of the page.

chara:trouble_shooting [2018/09/25 18:57]
gail_stargazer
chara:trouble_shooting [2020/01/28 03:02]
charaobs
Line 9: Line 9:
  
   * Were the clocks synced? Make sure the [SYNC CLOCKS] button on Cosmic Debris has been pushed to start the night. If the OPLE server does not display the correct CHARA time and the errors don't read (0) or (1), the clocks were not synced. Because the VME runs on its own clock and does not use the NTP server, it can drift over the course of the night and cause problems with finding fringes, even if it was synced at the start of the night. Syncing the clocks multiple times during the night may help to avoid this hidden clock problem.
-  * Did the Astrolib update on OPLE? If the job queue is stopped too soon after slewing on Cosmic Debris, the correct calculations for the carts will not be done by OPLE and you may be searching for fringes with the wrong star data. The proper star can be entered by typing hd …. into the OPLE server and hitting ENTER.
+  * Did the Astrolib update on OPLE? If the job queue is stopped too soon after slewing on Cosmic Debris, the correct calculations for the carts will not be done by OPLE and you may be searching for fringes with the wrong star data. Hit STAR ACQUIRED on CD to update OPLE. The proper star can also be entered by typing hd #### into the OPLE server and hitting ENTER.
 +  * Are the PoPs correct? After a PoP change, the PoPs are sometimes not updated in CD or OPLE.
 +  * Are the carts behaving, or are there vibrations or jumps of 100 microns or more every 3-6 seconds? Restart the OPLE server if they persist.
   * Is the target a high-proper-motion star? Red dwarfs are nearby stars and can have high proper motions. Scan a wider range to see if the fringes lie outside the usual calculated scan range. Binaries can also have very large offsets from the expected position if the astromod calculations for the companion star are used by mistake.
 +  * Do you have enough flux from each telescope or on each baseline?
   * Did you get the same star in each telescope? Sometimes a busy star field and poor pointing of the telescopes can lead to the wrong star being acquired and locked by tiptilt. View the stars in the finder window to see if all the stars match.
   * Check the CHARA time on the GPS server. The "Ext-CHARA," "CHARA-Sys," and "Ext-Sys" time offsets listed on the GPS server should be small (< 0.01 sec). If there are large time offsets, then a GSYNC might be needed.
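
The offset check in the last item can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `offsets_ok` is hypothetical, the offsets are typed in by hand from the GPS server display, and the 0.01 second limit is the one quoted above.

```shell
# Hypothetical helper (not part of the CHARA software).
# Usage: offsets_ok LIMIT OFFSET...
# Returns 0 when every offset (in seconds) is smaller in absolute value
# than LIMIT, and 1 as soon as one offset is too large.
offsets_ok() {
    limit=$1
    shift
    for off in "$@"; do
        # awk does the floating-point comparison; the shell only has integers
        awk -v x="$off" -v lim="$limit" \
            'BEGIN { x += 0; lim += 0; if (x < 0) x = -x; exit !(x < lim) }' \
            || return 1
    done
    return 0
}

# Example values are made up for illustration
if offsets_ok 0.01 0.002 -0.004 0.001; then
    echo "time offsets look small"
else
    echo "large time offset: a GSYNC might be needed"
fi
```

Negative offsets are compared by absolute value, so an offset of -0.02 fails the check just as 0.02 does.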
Line 28: Line 31:
 ==== Restarting Servers using the bootlaunch paradigm ====
  
- \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>" where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running; they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world-writable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing to note about the output of the bootlaunch scripts: they call a number of other programs whose own output may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. To launch a new server, that entry must be cleaned out if it is there, so "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
+If a server is not running or Socket Manager reports that a server is dead, look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list".
 + 
 +To restart a server, log on to the machine that runs the server and type "bootlaunch_master". This script will go through the list of executables and check which servers are running. If a server isn't running, bootlaunch_master will remove it from socket manager, clear the lock file, and relaunch the server. The instructions below describe how to restart individual servers, but this should no longer be necessary. \\  \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>" where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running; they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world-writable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing to note about the output of the bootlaunch scripts: they call a number of other programs whose own output may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. To launch a new server, that entry must be cleaned out if it is there, so "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
  
   * bootlaunch_beamsamp – Starts the beam sampler servers, BS1 and BS2.
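
The restart logic described for the bootlaunch scripts (report the PID if the server is already running; otherwise remove the socket manager entry, clear the lock file, and relaunch) can be sketched as a dry run. This is a hypothetical illustration, not the actual bootlaunch code: the function name `bootlaunch_plan`, the entry name, the lock file path, and the server command are all placeholders, and the function only prints the steps instead of executing them.

```shell
# Hypothetical dry-run sketch of the bootlaunch safety logic.
# NOT the real script: it prints what would be done and changes nothing.
# Usage: bootlaunch_plan RUNNING_PID ENTRY LOCKFILE SERVER_CMD...
#   RUNNING_PID  PID of the running server, or "" if it is not running
#   ENTRY        socket manager entry name (placeholder)
#   LOCKFILE     serial port lock file (placeholder)
bootlaunch_plan() {
    pid=$1
    entry=$2
    lock=$3
    shift 3
    if [ -n "$pid" ]; then
        # Safe to call on a live server: report the PID and stop
        echo "already running, PID $pid"
        return 0
    fi
    # Server is down: clean up stale state, then relaunch in the background
    echo "tsockman remove $entry"
    echo "rm -f $lock"
    echo "$* &"
}

# Placeholder names, for illustration only
bootlaunch_plan "" EXAMPLE_ENTRY /var/lock/LCK..ttyX example_server /dev/ttyX
```

In a real run, the "tsockman remove" step may answer "Process by that name does not exist" when no stale entry was left behind, which the text above describes as normal.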
Line 45: Line 50:
 ==== Restarting Servers using the rc.local file ====
  
- \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /chara/observe/socket_manager.list". Note that a server can be running fine, but if the Socket Manager drops the connection to it, it is as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
+ \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list". Note that a server can be running fine, but if the Socket Manager drops the connection to it, it is as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
  \\  \\
 Log on to the relevant computer by typing the computer name (ctrscrut, ople, s1, …). If the shortcut doesn't work, type "ssh //name//" where name is the computer name. \\  \\ Find out if the server is running by typing "ps aux | grep //server_name//" where server_name is the name of the server. \\ [ctrscrut:599] ps aux | grep pico_1 \\ observe 9281 0.0 0.0 61156 692 pts/3 S+ 13:58 0:00 grep pico_1 \\ observe 12578 0.0 0.0 24524 11212 ? S Jun16 33:14 /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg \\  \\ If the entry for the dead server shows up in the process list, identify the process identification number (12578 in the example above) and kill the server by typing "kill -9 //PID//" where PID is the process identification number. \\  \\ Look up the commands to restart the server by typing "more /etc/rc.local" (this is relevant for servers that run in the background). Press the space bar to scroll through the contents of the rc.local file. Locate the commands relevant to the server that needs to be restarted, then copy and paste them into a terminal window: \\  \\ #Start PICO server for PICO #1 \\ /bin/rm -f /var/lock/LCK..ttyC8 \\ /usr/local/bin/tsockman remove PICO_1 \\ /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg & \\  \\ The first command removes the lock to allow the server to restart. The second command removes the name from the socket manager listing. The last command restarts the server. Note that if you are restarting the servers as observe, you will need to remove the part of the command in the rc.local file that saves information in the /var/log///server_name//.log file (the actual command typed should resemble the last line above). \\  \\ There are text files on the desktop with many of the restart commands. Use these files for quick access to the relevant commands. The commands are edited and can be copied exactly as written. Files include Dome servers and all servers running on ctrscrut. Many of these commands are also located on the [[:chara:restarting_servers|Restarting Servers]] page.
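
The process check in this procedure can be sketched as a small shell helper that pulls the PID out of the "ps aux" listing and ignores the grep line. This is an illustrative sketch, not a CHARA tool: the function name `find_server_pids` is hypothetical, and it assumes the "ps aux" column layout shown in the example output (PID in column 2, command starting in column 11).

```shell
# Hypothetical helper (not part of the CHARA software).
# Reads `ps aux` style lines on stdin and prints the PID of every process
# whose line contains the pattern, skipping the grep process itself.
find_server_pids() {
    # The pattern travels through the environment so that it does not
    # show up on awk's own command line in the process list.
    PATTERN="$1" awk '
        # Column 2 is the PID; column 11 is the first word of the command
        $11 != "grep" && index($0, ENVIRON["PATTERN"]) > 0 { print $2 }
    '
}

# Typical use on the machine that runs the server:
#   ps aux | find_server_pids pico_1
```

Fed the two example lines from the listing above, the helper prints only 12578, the PID that the procedure then passes to "kill -9".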
chara/trouble_shooting.txt · Last modified: 2023/11/21 01:42 by charaobs