chara:trouble_shooting

Differences

This shows you the differences between two versions of the page.

chara:trouble_shooting [2018/09/25 15:35] gail_stargazer
chara:trouble_shooting [2020/01/23 14:08] jones
Line 28: Line 28:
 ==== Restarting Servers using the bootlaunch paradigm ====
  
- \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>", where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running; in that case they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world-writable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing to note about the output of the bootlaunch scripts: they call a number of other programs whose own output may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. In order to launch a new server, one needs to clean out the socket manager entry if it is there. To do that, "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
+If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list".
 + 
 +To restart a server, log on to the machine that runs the server and type "bootlaunch_master". This script goes through the list of executables and checks which servers are running. If a server isn't running, bootlaunch_master will remove it from socket manager, clear the lock file, and relaunch the server. The instructions below describe how to restart individual servers, but this should not be necessary anymore. \\  \\ A number of servers use an interim bootlaunch paradigm to restart. This is confined to servers that run on Ubuntu machines, namely the telescope bunker computers and gps. The basic syntax is "bootlaunch_<server>", where "<server>" is replaced by the server the script is designed to address. The scripts have a number of safeties built in, so it is safe to run them even if a server is already running; in that case they just output the process ID of the running server. The scripts also take care of the entry in socket manager as well as any serial port lock files. All the pertinent information is world-writable, so one should be able to run a bootlaunch script as observe. \\  \\ One thing to note about the output of the bootlaunch scripts: they call a number of other programs whose own output may be misleading in the context of bootlaunch. Chief among these is the output of "tsockman". If a server stopped unexpectedly, it may leave behind an entry in the socket manager. In order to launch a new server, one needs to clean out the socket manager entry if it is there. To do that, "tsockman remove <entry>" is called to remove "<entry>" before the new server is launched. If there is no entry, tsockman will respond with "Process by that name does not exist". THIS IS NORMAL and is not indicative of an error. The server in question launched (without fanfare) right after that output text. \\  \\ Here are the available bootlaunch scripts as of June 2017: \\  \\ gps computer:
  
   * bootlaunch_beamsamp – Starts the beam sampler servers, BS1 and BS2.
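The behaviour described for the bootlaunch scripts (report the PID if the server is already running, otherwise clear the socket manager entry and lock file and relaunch) can be sketched in shell roughly as follows. This is a hypothetical illustration, not the contents of the actual scripts: the names bootlaunch_sketch and find_pid, the TSOCKMAN variable, and the argument layout are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch of the bootlaunch logic; the real scripts may differ.
# find_pid, bootlaunch_sketch and TSOCKMAN are illustrative names only.

TSOCKMAN=${TSOCKMAN:-/usr/local/bin/tsockman}

# Print the PID of the first process whose command line matches $1, if any.
# Note: pgrep -f matches full command lines, so the pattern should be
# specific (for example the full path of the server executable).
find_pid() {
    pgrep -f "$1" | head -n 1
}

# Arguments: socket manager name, lock file (may be empty), server command line.
bootlaunch_sketch() {
    sockman_name=$1
    lock_file=$2
    shift 2

    pid=$(find_pid "$1")
    if [ -n "$pid" ]; then
        # Safety: never start a second copy, just report the running one.
        echo "$sockman_name is already running (PID $pid)"
        return 0
    fi

    # A stale entry may or may not exist. When there is none, tsockman
    # answers "Process by that name does not exist", which is expected.
    "$TSOCKMAN" remove "$sockman_name"

    # Clear a leftover serial port lock file, if one was given.
    if [ -n "$lock_file" ]; then
        rm -f "$lock_file"
    fi

    # Launch the server in the background and report its PID.
    "$@" &
    echo "launched $sockman_name (PID $!)"
}
```

With these assumed names, a call would look like: bootlaunch_sketch PICO_1 /var/lock/LCK..ttyC8 /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg. Running it a second time only prints the PID of the running server, which mirrors the "safe to rerun" property described above.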
Line 45: Line 47:
 ==== Restarting Servers using the rc.local file ====
  
- \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /chara/observe/socket_manager.list". Note that servers can be running fine, but if the Socket Manager drops the connection to them, they are as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
+ \\ This procedure is applicable to servers that have not switched over to the bootlaunch paradigm. \\  \\ If a server is not running or Socket Manager reports that a server is dead, then look at the socket manager list to find out what computer the server runs on ([[:chara:socket_manager_list_file|socket_manager.list]]). You can also look at the up-to-date file by opening a terminal window and typing "less /ctrscrut/chara/etc/socket_manager/socket_manager.list". Note that servers can be running fine, but if the Socket Manager drops the connection to them, they are as good as dead when it comes to functioning with other servers or as part of a larger sequence. \\
  \\  \\
 Log on to the relevant computer by typing the computer name (ctrscrut, ople, s1, …). If the shortcut doesn't work, then type "ssh //name//", where //name// is the computer name. \\  \\ Find out if the server is running by typing "ps aux | grep //server_name//", where //server_name// is the name of the server. \\ [ctrscrut:599] ps aux | grep pico_1 \\ observe 9281 0.0 0.0 61156 692 pts/3 S+ 13:58 0:00 grep pico_1 \\ observe 12578 0.0 0.0 24524 11212 ? S Jun16 33:14 /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg \\  \\ If the entry for the dead server shows up in the process list, then identify the process identification number (12578 in the example above) and kill the server by typing "kill -9 //PID//", where //PID// is the process identification number. \\  \\ Look up the commands to restart the server by typing "more /etc/rc.local" (this is relevant for servers that run in the background). Press the space bar to scroll through the contents of the rc.local file. Locate the commands relevant for the server that needs to be restarted and copy and paste them into a terminal window: \\  \\ #Start PICO server for PICO #1 \\ /bin/rm -f /var/lock/LCK..ttyC8 \\ /usr/local/bin/tsockman remove PICO_1 \\ /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg & \\  \\ The first command removes the lock to allow the server to restart. The second command removes the name from the socket manager listing. The last command restarts the server. Note that if you are restarting the servers as observe, you will need to remove the part of the command in the rc.local file that saves information in the /var/log///server_name//.log file (the actual command typed should resemble the last line above). \\  \\ There are text files on the desktop with many of the restart commands. Use these files for quick access to the relevant commands. The commands are edited and can be copied exactly as written. Files include Dome servers and all servers running on ctrscrut. Many of these commands are also located on the [[:chara:restarting_servers|Restarting Servers]] page.
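The manual procedure above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names server_pids and restart_commands are hypothetical and are not part of the CHARA software, while the paths and the PICO_1 example are the ones quoted above. The second helper only prints the commands, so they can be reviewed before being pasted into a terminal.

```shell
#!/bin/sh
# Sketch only: server_pids and restart_commands are hypothetical helper
# names used for illustration, not existing CHARA tools.

# Read `ps aux` output on stdin and print the PID (column 2) of every
# process whose line matches the pattern, skipping the grep process itself.
server_pids() {
    grep -- "$1" | grep -v ' grep ' | awk '{ print $2 }'
}

# Print the three restart commands in the order used in rc.local:
# remove the serial lock, remove the socket manager entry, relaunch.
# Arguments: lock file, socket manager name, then the server command line.
restart_commands() {
    lock_file=$1
    sockman_name=$2
    shift 2
    echo "/bin/rm -f $lock_file"
    echo "/usr/local/bin/tsockman remove $sockman_name"
    echo "$* &"
}

# Example (nothing is executed here, the commands are only printed):
#   ps aux | server_pids pico_1
#   restart_commands /var/lock/LCK..ttyC8 PICO_1 \
#       /usr/local/bin/pico_server /dev/ttyC8 /ctrscrut/chara/etc/pico_1.cfg
```

Any PID reported by server_pids would still be killed by hand with "kill -9 //PID//" before the printed commands are run, as in the procedure above.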
Line 59: Line 61:
 ==== The Telescope won't move or stopped moving ====
  
- \\ Has power to the drives been turned on? Are the scopes disabled? The usual state of the telescopes is disabled until enabled. This is due to the stall function of the scopes, which eventually disables the scopes when they are stowed. Enable the scopes by hitting [ENABLE] on the dome gui or in the telescope gui control tab. If a scope disables itself during a slew, it may be just an overcautious stall function. Hit ENABLE and the scope should continue to slew. If it disables again without moving, there may be something wrong at the scope, e.g., a hatch left open, a ladder not put away, a tool on the floor, or something else physically impeding the motion of the scope. You will need to go to the dome to see what it is. The computer in the dome will give you control of the scope to turn it away from the problem. \\  \\ Sometimes the dome guis get hung up and can cause erratic motion or no motion of the scopes. Check them for current times and continuous updates of numbers. If they are not updating, try to REOPEN them first. If that does not work, close the gui and open a new one. If a new one does not open, the dome server may be dead or Sockman may have lost track of it. See **Dome Server Restart**  below. \\  \\ **Azimuth Limit Switches** \\  \\ As of November 2017, the azimuth limit switches are enabled and can stop the motion of the scopes if they try to go beyond -90° or +450°. The scopes will not be movable with normal inputs, so follow these instructions to return them from the out-of-range condition. \\  \\ 1. On the domegui MANUAL tab, click STOP so pulses won't be sent to the drive by the control software. \\  \\ 2. Make sure you understand why the limit was hit, which may require a trip to the telescope. \\  \\ 3. Click the OVERRIDE ON button in the domegui MANUAL tab. After this, the hardware doesn't care about the limit switches and you're free to move the telescope. \\  \\ 4. Click ENABLE; then you can move the telescope back to its normal range of operation. \\  \\ 5. After the telescope is back in its normal range, click OVERRIDE OFF, which makes the hardware aware of the limits again. \\
+ \\ Has power to the drives been turned on? Are the scopes disabled? The usual state of the telescopes is disabled until enabled. This is due to the stall function of the scopes, which eventually disables the scopes when they are stowed. Enable the scopes by hitting [ENABLE] on the dome gui or in the telescope gui control tab. If a scope disables itself during a slew, it may be just an overcautious stall function. Hit ENABLE and the scope should continue to slew. If it disables again without moving, there may be something wrong at the scope, e.g., a hatch left open, a ladder not put away, a tool on the floor, or something else physically impeding the motion of the scope. You will need to go to the dome to see what it is. The computer in the dome will give you control of the scope to turn it away from the problem. \\  \\ Sometimes the dome guis get hung up and can cause erratic motion or no motion of the scopes. Check them for current times and continuous updates of numbers. If they are not updating, try to REOPEN them first. If that does not work, close the gui and open a new one. If a new one does not open, the dome server may be dead or Sockman may have lost track of it. See **Dome Server Restart**  below. \\  \\ **Azimuth Limit Switches** \\  \\ As of November 2017, the azimuth limit switches are enabled and can stop the motion of the scopes if they try to go beyond -90° or +450°. The scopes will not be movable with normal inputs, so follow these instructions to return them from the out-of-range condition. \\  \\ 1. On the domegui MANUAL tab, click STOP so pulses won't be sent to the drive by the control software. \\  \\ 2. Make sure you understand why the limit was hit, which may require a trip to the telescope. \\  \\ 3. Click the OVERRIDE ON button in the domegui MANUAL tab. After this, the hardware doesn't care about the limit switches and you're free to move the telescope. \\  \\ 4. Click ENABLE; then you can move the telescope back to its normal range of operation. \\  \\ 5. After the telescope is back in its normal range, click OVERRIDE OFF, which makes the hardware aware of the limits again.
  
 ==== The Telescope won't track ====
Line 273: Line 275:
 ==== Adding or Finding a Star in the CHARA Database ====
  
-DBADD is a command entered in a terminal window that allows you to look up the CHARA number for a star or add a star to the CHARA database. To use dbadd, open a terminal window and type "dbadd starname", where starname is the common name (e.g., Vega) or an identifier of the star (e.g., IRC, HR, HD, SAO, FK5, HIP, GJ, or 2MASS designation). If the object is in the database, then it will return the CHARA number.
+The command "dbadd" can be used to look up the CHARA number for a star or to add a star to the CHARA database. To use dbadd, open a terminal window and type "dbadd starname", where starname is the common name (e.g., Vega) or an identifier of the star (e.g., IRC, HR, HD, SAO, FK5, HIP, GJ, or 2MASS designation). If the object is in the database, then it will return the CHARA number.
  
 You can also look up different identifiers for the star by using the SIMBAD database: [[http://simbad.u-strasbg.fr/simbad/sim-fbasic|http://simbad.u-strasbg.fr/simbad/sim-fbasic]] \\
Line 299: Line 301:
 Is the above the object you are looking for? [y/n] y \\ Added "BD+41 3807" to the database.
  
-The object "BD+41 3807" has been added to the database as CHARA number 320414.
+In the example above, the object "BD+41 3807" has been added to the database as CHARA number 320414.
 + 
 + \\ __Targets that are not in SIMBAD__ \\  \\ Some objects, like recently discovered novae, are not in SIMBAD. Entering "dbadd <star designation>" for a star not in the SIMBAD database returns the message "Simbad is unable to find an object matching <star designation>. Try using a different catalog designation, or use the "-m" switch to add the object manually." \\  \\ The command "dbadd -m" is used by entering the star name and coordinates in this format: \\  \\ Usage: dbadd -m <name> <RA> <Dec> \\  \\ <name>: Object ID (no spaces) \\  \\ <RA> : XXhXXmXX.X or XXhXX.X (no spaces) \\  \\ <Dec> : XXdXXmXX.X or XXdXX.X (no spaces) \\  \\ There are cases where an object is not in SIMBAD and doesn't return a result in dbadd, but is in fact in the CHARA database. Novae and AGNs are the likely objects that cause this result. At times, such objects have multiple entries and several CHARA numbers because their names are so unusual. A system will be developed to find these entries without knowing or having a conventional database designation. \\  \\ __Binary stars__ \\  \\ Some binary stars have a common HD number with an A or B after it. These can cause problems because Cosmic Debris does not accept non-numerical entries when filling in the star designation. These stars are likely in CHARA's database but need to be searched for in dbadd or on SIMBAD to get the CHARA number or another designation.
  
- \\ __Targets that are not in the CHARA database or in SIMBAD__ \\  \\ Some objects, like recently discovered novae, are not in SIMBAD. Entering "dbadd <star designation>" for a star not in the SIMBAD database returns the message "Simbad is unable to find an object matching <star designation>. Try using a different catalog designation, or use the "-m" switch to add the object manually." \\  \\ The command "dbadd -m" is used by entering the star name and coordinates in this format: \\  \\ Usage: dbadd -m <name> <RA> <Dec> \\  \\ <name>: Object ID (no spaces) \\  \\ <RA> : XXhXXmXX.X or XXhXX.X (no spaces) \\  \\ <Dec> : XXdXXmXX.X or XXdXX.X (no spaces) \\  \\ There are cases where an object is not in SIMBAD and doesn't return a result in dbadd, but is in fact in the CHARA database. Novae and AGNs are the likely objects that cause this result. At times, such objects have multiple entries and several CHARA numbers because their names are so unusual. A system will be developed to find these entries without knowing or having a conventional database designation. \\  \\ __Binary stars__ \\  \\ Some binary stars have a common HD number with an A or B after it. These can cause problems because Cosmic Debris does not accept non-numerical entries when filling in the star designation. These stars are likely in CHARA's database but need to be searched for in dbadd or on SIMBAD to get the CHARA number or another designation. \\ When a recognized identifier is entered, the HD number with A or B will usually display. Confirm that the coordinates, magnitudes, and spectral type match the star desired. If they do not match, you may have the wrong star or the database may have the wrong info, and the baseline solution will be affected. Offsets based on incorrect coordinates or misidentifications can move the fringes many thousands of microns away from the calculated position. This can happen when observing a dim companion (B) using the brighter (A) companion's coordinates. Inform the observers that noting and using the CHARA number will save time the next time the target is observed. \\  \\ __Editing the database__ \\  \\ If you find a mistake in the database, please send an email to Nils to have it corrected. Identify what you believe to be the error and what the correct information is.
+When a recognized identifier is entered, the HD number with A or B will usually display. Confirm that the coordinates, magnitudes, and spectral type match the star desired. If they do not match, you may have the wrong star or the database may have the wrong info, and the baseline solution will be affected. Offsets based on incorrect coordinates or misidentifications can move the fringes many thousands of microns away from the calculated position. This can happen when observing a dim companion (B) using the brighter (A) companion's coordinates. Inform the observers that noting and using the CHARA number will save time the next time the target is observed. \\  \\ __Editing the database__ \\  \\ If you find a mistake in the database, please send an email to Nils to have it corrected. Identify what you believe to be the error and what the correct information is.
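For the manual-entry form "dbadd -m <name> <RA> <Dec>" described earlier in this section, a small shell check of the argument formats might look like the sketch below. The helper names valid_ra, valid_dec, and dbadd_manual_command are hypothetical and are not part of dbadd. The patterns follow the usage text (XXhXXmXX.X or XXhXX.X, no spaces). Treating the decimal part as optional and allowing a leading sign on the declination are assumptions that should be checked against dbadd itself.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of dbadd.
# Assumptions: the decimal part is optional, and Dec may carry a sign.

# RA: XXhXXmXX.X or XXhXX.X
valid_ra() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{2}h[0-9]{2}(m[0-9]{2})?(\.[0-9]+)?$'
}

# Dec: XXdXXmXX.X or XXdXX.X, with an optional leading + or -
valid_dec() {
    printf '%s\n' "$1" | grep -Eq '^[+-]?[0-9]{2}d[0-9]{2}(m[0-9]{2})?(\.[0-9]+)?$'
}

# Validate the three arguments and print the dbadd command to run.
# Nothing is executed; the command is only echoed for review.
dbadd_manual_command() {
    name=$1
    ra=$2
    dec=$3
    case $name in
        *' '*) echo "name must not contain spaces: $name" >&2; return 1 ;;
    esac
    valid_ra "$ra"   || { echo "bad RA format: $ra" >&2; return 1; }
    valid_dec "$dec" || { echo "bad Dec format: $dec" >&2; return 1; }
    echo "dbadd -m $name $ra $dec"
}

# Example with made-up placeholder values (not a real target):
#   dbadd_manual_command MyNova 20h12m34.5 +41d12.5
```

Printing the command instead of running it keeps the sketch safe to try, since a mistyped manual entry would otherwise add a wrong object to the database.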
  
 ==== Adding an anchor to the wiki page ====
chara/trouble_shooting.txt · Last modified: 2024/06/18 00:21 by charaobs